Diffstat (limited to 'pod/perli18n.pod')
-rw-r--r-- pod/perli18n.pod 234
1 files changed, 0 insertions, 234 deletions
diff --git a/pod/perli18n.pod b/pod/perli18n.pod
deleted file mode 100644
index aea6b4ac57..0000000000
--- a/pod/perli18n.pod
+++ /dev/null
@@ -1,234 +0,0 @@
-=head1 NAME
-
-perli18n - Perl i18n (internationalization)
-
-=head1 DESCRIPTION
-
-Perl supports the language-specific notions of data like
-"is this a letter" and "which letter comes first". These
-are very important issues especially for languages other
-than English -- but also for English: it would be very
-naïve indeed to think that C<A-Za-z> defines all the "letters".
-
-Perl understands the language-specific data via the standardized
-(ISO C, XPG4, POSIX 1.c) method called "the locale system".
-The locale system is controlled per application using one
-function call and several environment variables.
-
-=head1 USING LOCALES
-
-If your operating system supports the locale system, you have
-installed it, and you have set your locale environment
-variables correctly (please see below) before running Perl, Perl will
-understand your data correctly according to your locale settings.
-
-At run time you can switch locales using C<POSIX::setlocale()>.
-
- # setlocale is the function call
- # LC_CTYPE will be explained later
-
- use POSIX qw(setlocale LC_CTYPE);
-
- # query and save the old locale.
- $old_locale = setlocale(LC_CTYPE);
-
- setlocale(LC_CTYPE, "fr_CA.ISO8859-1");
- # LC_CTYPE now in locale "French, Canada, codeset ISO 8859-1"
-
- setlocale(LC_CTYPE, "");
- # LC_CTYPE is now in the locale that LC_ALL / LC_CTYPE / LANG define.
- # See below for documentation about LC_ALL / LC_CTYPE / LANG.
-
- # restore the old locale
- setlocale(LC_CTYPE, $old_locale);
-
-The first argument of C<setlocale()> is called B<the category> and the
-second argument B<the locale>. The category tells in which aspect of
-data processing we want to apply language-specific rules; the locale
-tells which language, country/territory, and codeset to use. Read on for
-the naming of the locales: not all systems name locales as in the example.
-
-For further information about the categories, please consult your
-L<setlocale(3)> manual. For the locales available in your system,
-also consult the L<setlocale(3)> manual and see whether it leads you
-to the list of the available locales (search for the C<SEE ALSO>
-section). If that fails, try the following commands on the command
-line:
-
-=over 12
-
-=item locale -a
-
-=item nlsinfo
-
-=item ls /usr/lib/nls/loc
-
-=item ls /usr/lib/locale
-
-=item ls /usr/lib/nls
-
-=back
-
-and see whether they list something resembling these
-
- en_US.ISO8859-1 de_DE.ISO8859-1 ru_RU.ISO8859-5
- en_US de_DE ru_RU
- en de ru
- english german russian
- english.iso88591 german.iso88591 russian.iso88595
-
-Sadly, although the calling interface has been standardized, the
-names of the locales have not. The naming is usually
-language_country/territory.codeset, but the latter parts may not be
-present.
-
-Two locales are worth special mention: C<"C"> and C<"POSIX">.
-Currently these are effectively the same locale: the difference is
-mainly that the first one is defined by the C standard and the second
-one is defined by the POSIX standard. What they define is the
-B<default locale> in which every program starts. The
-language is (American) English and the character codeset C<ASCII>.
-B<NOTE>: Not all systems have the C<"POSIX"> locale (not all systems
-are POSIX), so use the C<"C"> locale when you need the default locale.
-
-=head2 The C<use locale> Pragma
-
-By default, Perl ignores the current locale. The C<use locale> pragma
-tells Perl to use the current locale for some operations: the
-comparison operators (lt, le, cmp, ge, gt) and sort use
-C<LC_COLLATE>; regular expressions and case-modification functions
-(uc, lc, ucfirst, lcfirst) use C<LC_CTYPE>; and formatting functions
-(printf and sprintf) use C<LC_NUMERIC>. The default behavior is restored
-by C<no locale> or on reaching the end of the enclosing block.
-
-Note that the result of any operation that uses locale information is
-tainted, since locales can be created by unprivileged users on some
-systems (see L<perlsec>).
-
-=head2 Category LC_COLLATE: Collation
-
-When in the scope of C<use locale>, Perl obeys the C<LC_COLLATE>
-locale information, which controls the application's notion of the
-collation (ordering) of characters. In most Latin alphabets C<B>
-follows C<A>, but where do C<Á> and C<Ä> belong?
-
-B<NOTE>: Comparing and sorting by locale is usually slower than the
-default sorting; factors of 2 to 4 have been observed. It will also
-consume more memory: while a Perl scalar variable is participating in
-any string comparison or sorting operation and obeying the locale
-collation rules, it will take about 3 to 15 times more memory than normal
-(the exact factor depends on the operating system). These downsides
-are dictated more by the operating system implementation of the locale
-system than by Perl.
-
-Here is a code snippet that will tell you which characters are
-alphanumeric in the current locale, in the locale order:
-
- use POSIX qw(setlocale LC_COLLATE);
- use locale;
-
- setlocale(LC_COLLATE, "");
- print +(sort grep /\w/, map { chr() } 0..255), "\n";
-
-The default collation must be used, for example, for sorting raw binary
-data, whereas the locale collation is useful for natural text.
-
-B<NOTE>: In some locales some characters may have no collation value
-at all. For example, if C<'-'> is such a character, then
-C<relocate> and C<re-locate> may sort to the same place.
-
-B<NOTE>: In certain environments the locale support of the operating
-system is quite simply broken and cannot be used or fixed by Perl. Such
-deficiencies can and will result in mysterious hangs and/or Perl core
-dumps. One such example is IRIX before release 6.2, where the
-C<LC_COLLATE> support simply does not work. When confronted with such
-systems, please report in excruciating detail to C<perlbug@perl.com>
-and complain to your vendor: bug fixes may exist for your operating
-system for these problems. Sometimes such bug fixes are called an
-operating system upgrade.
-
-B<NOTE>: In Perl releases before 5.003_06, per-locale collation
-was possible using the C<I18N::Collate> library module. This is now
-mildly obsolete and to be avoided. The C<LC_COLLATE> functionality is
-integrated into the Perl core language and one can use scalar data
-completely normally -- there is no need to juggle with the scalar
-references of C<I18N::Collate>.
-
-=head2 Category LC_CTYPE: Character Types
-
-When in the scope of C<use locale>, Perl obeys the C<LC_CTYPE> locale
-information, which controls the application's notion of which characters
-are alphabetic. In Perl this affects the regular expression
-metanotation C<\w>, which stands for alphanumeric characters, that is,
-alphabetic and numeric characters (please consult L<perlre> for more
-information about regular expressions). Thanks to C<LC_CTYPE>,
-depending on your locale settings, characters like C<Æ>, C<É>,
-C<ß>, and C<ø> may be understood as C<\w> characters.
-
-=head2 Category LC_NUMERIC: Numeric Formatting
-
-When in the scope of C<use locale>, Perl obeys the C<LC_NUMERIC>
-locale information, which controls the application's notion of how numbers
-should be formatted for input and output. In Perl this affects the
-printf and sprintf functions, as well as C<POSIX::strtod()>.
-
-=head1 ENVIRONMENT
-
-=over 12
-
-=item PERL_BADLANG
-
-A string that controls whether Perl warns at startup about failed
-locale settings. This can happen if the locale support in the
-operating system is lacking (broken) in some way. If this string has
-a nonzero integer value, Perl will not complain.
-
-B<NOTE>: This just hides the warning message. The message points
-to some problem in your system's locale support, and you should
-investigate what the problem is.
-
-=back
-
-The following environment variables are not specific to Perl: they are
-part of the standardized (ISO C, XPG4, POSIX 1.c) setlocale method to
-control an application's opinion on data.
-
-=over 12
-
-=item LC_ALL
-
-C<LC_ALL> is the "override-all" locale environment variable. If it is
-set, it overrides all the rest of the locale environment variables.
-
-=item LC_CTYPE
-
-In the absence of C<LC_ALL>, C<LC_CTYPE> chooses the character type
-locale. In the absence of both C<LC_ALL> and C<LC_CTYPE>, C<LANG>
-chooses the character type locale.
-
-=item LC_COLLATE
-
-In the absence of C<LC_ALL>, C<LC_COLLATE> chooses the collation
-locale. In the absence of both C<LC_ALL> and C<LC_COLLATE>, C<LANG>
-chooses the collation locale.
-
-=item LC_NUMERIC
-
-In the absence of C<LC_ALL>, C<LC_NUMERIC> chooses the numeric format
-locale. In the absence of both C<LC_ALL> and C<LC_NUMERIC>, C<LANG>
-chooses the numeric format locale.
-
-=item LANG
-
-C<LANG> is the "catch-all" locale environment variable. If it is set,
-it is used as the last resort after the overall C<LC_ALL> and the
-category-specific C<LC_...>.
-
-=back
-
-There are further locale-controlling environment variables
-(C<LC_MESSAGES>, C<LC_MONETARY>, C<LC_TIME>) but Perl B<does not> currently
-use them, except possibly as they affect the behavior of library
-functions called by Perl extensions.
-
-=cut
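
Review note (not part of the deleted file): the deleted text describes two mechanisms that are easy to confuse, the process-wide POSIX::setlocale() call and the lexically scoped "use locale" pragma. The following is only a minimal sketch of how the two might be combined, assuming a hypothetical helper named sort_in_locale that does not appear in the original document. It uses the "C" locale because the text above says that one is always available; any other locale name would be system-specific.

```perl
use strict;
use warnings;
use POSIX qw(setlocale LC_COLLATE);

# Hypothetical helper (an illustration, not from the original POD):
# sort a list under the given collation locale, then restore the
# LC_COLLATE locale that was in effect before the call.
sub sort_in_locale {
    my ($locale, @words) = @_;

    # Query and save the old collation locale.
    my $old_locale = setlocale(LC_COLLATE);

    # setlocale() returns undef if the requested locale is unavailable.
    defined setlocale(LC_COLLATE, $locale)
        or return;

    my @sorted;
    {
        # The pragma is lexically scoped: it applies only to this block.
        use locale;
        @sorted = sort @words;
    }

    # Restore the old locale.
    setlocale(LC_COLLATE, $old_locale) if defined $old_locale;
    return @sorted;
}

my @sorted = sort_in_locale("C", qw(banana Apple cherry apple));
print "@sorted\n";
```

With the "C" locale this prints "Apple apple banana cherry", since that locale collates by character code. Passing a name such as the "fr_CA.ISO8859-1" used in the example above would only work on systems that provide a locale under that name; on others the helper returns an empty list.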