author | Chip Salzenberg <chip@atlantic.net> | 1996-12-19 16:44:00 +1200 |
---|---|---|
committer | Chip Salzenberg <chip@atlantic.net> | 1996-12-19 16:44:00 +1200 |
commit | f2cbb5f7981d75896143accdc8bed8a11f580b2e (patch) | |
tree | 44303d895bc75f6c65c58e08f96a54df0218a06c /pod | |
parent | 858f93c38ca9fcfb9375581eee2cdb1aea7005ce (diff) | |
download | perl-f2cbb5f7981d75896143accdc8bed8a11f580b2e.tar.gz |
[shell changes from patch from perl5.003_11 to perl5.003_12]
Change from running these commands:
# create new directories
test -d lib/CPAN || mkdir lib/CPAN
test -d vms/ext/DCLsym || mkdir vms/ext/DCLsym
# be sure that new test is executable
touch t/op/recurse.t
chmod +x t/op/recurse.t
# get rid of old files
rm -f lib/splain
rm -f old_embed.pl
rm -f old_global.sym
rm -f old_perl_exp.SH
rm -f pod/perli18n.pod
rm -f t/re_tests
# ready to patch
exit 0
Diffstat (limited to 'pod')
-rw-r--r-- | pod/perli18n.pod | 234 |
1 files changed, 0 insertions, 234 deletions
diff --git a/pod/perli18n.pod b/pod/perli18n.pod
deleted file mode 100644
index aea6b4ac57..0000000000
--- a/pod/perli18n.pod
+++ /dev/null
@@ -1,234 +0,0 @@
-=head1 NAME
-
-perl18n - Perl i18n (internalization)
-
-=head1 DESCRIPTION
-
-Perl supports the language-specific notions of data like
-"is this a letter" and "which letter comes first". These
-are very important issues especially for languages other
-than English -- but also for English: it would be very
-naïve indeed to think that C<A-Za-z> defines all the "letters".
-
-Perl understands the language-specific data via the standardized
-(ISO C, XPG4, POSIX 1.c) method called "the locale system".
-The locale system is controlled per application using one
-function call and several environment variables.
-
-=head1 USING LOCALES
-
-If your operating system supports the locale system and you have
-installed the locale system and you have set your locale environment
-variables correctly (please see below) before running Perl, Perl will
-understand your data correctly according to your locale settings.
-
-In runtime you can switch locales using the POSIX::setlocale().
-
- # setlocale is the function call
- # LC_CTYPE will be explained later
-
- use POSIX qw(setlocale LC_CTYPE);
-
- # query and save the old locale.
- $old_locale = setlocale(LC_CTYPE);
-
- setlocale(LC_CTYPE, "fr_CA.ISO8859-1");
- # LC_CTYPE now in locale "French, Canada, codeset ISO 8859-1"
-
- setlocale(LC_CTYPE, "");
- # LC_CTYPE now in locale what the LC_ALL / LC_CTYPE / LANG define.
- # see below for documentation about the LC_ALL / LC_CTYPE / LANG.
-
- # restore the old locale
- setlocale(LC_CTYPE, $old_locale);
-
-The first argument of C<setlocale()> is called B<the category> and the
-second argument B<the locale>. The category tells in what aspect of
-data processing we want to apply language-specific rules, the locale
-tells in what language-country/territory-codeset - but read on for the
-naming of the locales: not all systems name locales as in the example.
-
-For further information about the categories, please consult your
-L<setlocale(3)> manual. For the locales available in your system,
-also consult the L<setlocale(3)> manual and see whether it leads you
-to the list of the available locales (search for the C<SEE ALSO>
-section). If that fails, try out in command line the following
-commands:
-
-=over 12
-
-=item locale -a
-
-=item nlsinfo
-
-=item ls /usr/lib/nls/loc
-
-=item ls /usr/lib/locale
-
-=item ls /usr/lib/nls
-
-=back
-
-and see whether they list something resembling these
-
- en_US.ISO8859-1 de_DE.ISO8859-1 ru_RU.ISO8859-5
- en_US de_DE ru_RU
- en de ru
- english german russian
- english.iso88591 german.iso88591 russian.iso88595
-
-Sadly enough even if the calling interface has been standardized the
-names of the locales are not. The naming usually is
-language_country/territory.codeset but the latter parts may not be
-present.
-
-Two special locales are worth special mention: C<"C"> and C<"POSIX">.
-Currently and effectively these are the same locale: the difference is
-mainly that the first one is defined by the C standard and the second
-one is defined by the POSIX standard. What they mean and define is
-the B<default locale> in which every program does start in. The
-language is (American) English and the character codeset C<ASCII>.
-B<NOTE>: Not all systems have the C<"POSIX"> locale (not all systems
-are POSIX), so use the C<"C"> locale when you need the default locale.
-
-=head2 The C<use locale> Pragma
-
-By default, Perl ignores the current locale. The C<use locale> pragma
-tells Perl to use the current locale for some operations: The
-comparison functions (lt, le, eq, cmp, ne, ge, gt, sort) use
-C<LC_COLLATE>; regular expressions and case-modification functions
-(uc, lc, ucfirst, lcfirst) use C<LC_CTYPE>; and formatting functions
-(printf and sprintf) use C<LC_NUMERIC>. The default behavior returns
-with C<no locale> or by reaching the end of the enclosing block.
-
-Note that the result of any operation that uses locale information is
-tainted, since locales can be created by unprivileged users on some
-systems (see L<perlsec.pod>).
-
-=head2 Category LC_COLLATE: Collation
-
-When in the scope of C<use locale>, Perl obeys the B<LC_COLLATE>
-environment variable which controls application's notions on the
-collation (ordering) of the characters. C<B> does in most Latin
-alphabets follow the C<A> but where do the C<Á> and C<Ä> belong?
-
-B<NOTE>: Comparing and sorting by locale is usually slower than the
-default sorting; factors of 2 to 4 have been observed. It will also
-consume more memory: while a Perl scalar variable is participating in
-any string comparison or sorting operation and obeying the locale
-collation rules it will take about 3-15 (the exact value depends on
-the operating system) times more memory than normally. These downsides
-are dictated more by the operating system implementation of the locale
-system than by Perl.
-
-Here is a code snippet that will tell you what are the alphanumeric
-characters in the current locale, in the locale order:
-
- use POSIX qw(setlocale LC_COLLATE);
- use locale;
-
- setlocale(LC_COLLATE, "");
- print +(sort grep /\w/, map { chr() } 0..255), "\n";
-
-The default collation must be used for example for sorting raw binary
-data whereas the locale collation is useful for natural text.
-
-B<NOTE>: In some locales some characters may have no collation value
-at all -- this means for example if the C<'-'> is such a character the
-C<relocate> and C<re-locate> may sort to the same place.
-
-B<NOTE>: For certain environments the locale support by the operating
-system is very simply broken and cannot be used or fixed by Perl. Such
-deficiencies can and will result in mysterious hangs and/or Perl core
-dumps. One such example is IRIX before the release 6.2, the
-C<LC_COLLATE> support simply does not work. When confronted with such
-systems, please report in excruciating detail to C<perlbug@perl.com>,
-complain to your vendor, maybe some bug fixes exist for your operating
-system for these problems? Sometimes such bug fixes are called an
-operating system upgrade.
-
-B<NOTE>: In the pre-5.003_06 Perl releases the per-locale collation
-was possible using the C<I18N::Collate> library module. This is now
-mildly obsolete and to be avoided. The C<LC_COLLATE> functionality is
-integrated into the Perl core language and one can use scalar data
-completely normally -- there is no need to juggle with the scalar
-references of C<I18N::Collate>.
-
-=head2 Category LC_CTYPE: Character Types
-
-When in the scope of C<use locale>, Perl obeys the C<LC_CTYPE> locale
-information which controls application's notions on which characters
-are alphabetic characters. This affects in Perl the regular expression
-metanotation C<\\w> which stands for alphanumeric characters, that is,
-alphabetic and numeric characters (please consult L<perlre> for more
-information about regular expressions). Thanks to the C<LC_CTYPE>,
-depending on your locale settings, characters like C<Æ>, C<É>,
-C<ß>, C<ø>, may be understood as C<\w> characters.
-
-=head2 Category LC_NUMERIC: Numeric Formatting
-
-When in the scope of C<use locale>, Perl obeys the C<LC_NUMERIC>
-locale information which controls application's notions on how numbers
-should be formatted for input and output. This affects in Perl the
-printf and fprintf function, as well as POSIX::strtod.
-
-=head1 ENVIRONMENT
-
-=over 12
-
-=item PERL_BADLANG
-
-A string that controls whether Perl warns in its startup about failed
-locale settings. This can happen if the locale support in the
-operating system is lacking (broken) is some way. If this string has
-an integer value differing from zero, Perl will not complain.
-
-B<NOTE>: This is just hiding the warning message. The message tells
-about some problem in your system's locale support and you should
-investigate what the problem is.
-
-=back
-
-The following environment variables are not specific to Perl: They are
-part of the standardized (ISO C, XPG4, POSIX 1.c) setlocale method to
-control an application's opinion on data.
-
-=over 12
-
-=item LC_ALL
-
-C<LC_ALL> is the "override-all" locale environment variable. If it is
-set, it overrides all the rest of the locale environment variables.
-
-=item LC_CTYPE
-
-In the absence of C<LC_ALL>, C<LC_CTYPE> chooses the character type
-locale. In the absence of both C<LC_ALL> and C<LC_CTYPE>, C<LANG>
-chooses the character type locale.
-
-=item LC_COLLATE
-
-In the absence of C<LC_ALL>, C<LC_COLLATE> chooses the collation
-locale. In the absence of both C<LC_ALL> and C<LC_COLLATE>, C<LANG>
-chooses the collation locale.
-
-=item LC_NUMERIC
-
-In the absence of C<LC_ALL>, C<LC_NUMERIC> chooses the numeric format
-locale. In the absence of both C<LC_ALL> and C<LC_NUMERIC>, C<LANG>
-chooses the numeric format.
-
-=item LANG
-
-C<LANG> is the "catch-all" locale environment variable. If it is set,
-it is used as the last resort after the overall C<LC_ALL> and the
-category-specific C<LC_...>.
-
-=back
-
-There are further locale-controlling environment variables
-(C<LC_MESSAGES, LC_MONETARY, LC_TIME>) but Perl B<does not> currently
-use them, except possibly as they affect the behavior of library
-functions called by Perl extensions.
-
-=cut