From fde18df140d5f64815bdd632a127ecd5ce3d97fa Mon Sep 17 00:00:00 2001
From: Jarkko Hietaniemi
Date: Thu, 16 Jan 2003 01:58:39 +0000
Subject: Make the locale-induced UTF-8-ification of STD fhs and the default file open layer explicit (either -C or PERL_UTF8_LOCALE), instead of implicit (and unasked-for).

p4raw-id: //depot/perl@18490
---
 pod/perlrun.pod      | 20 +++++++++++++++-----
 pod/perlunicode.pod  | 21 +++++++++------------
 pod/perluniintro.pod | 16 +++++++++-------
 pod/perlvar.pod      | 25 ++++++++++---------------
 4 files changed, 43 insertions(+), 39 deletions(-)

diff --git a/pod/perlrun.pod b/pod/perlrun.pod
index 72517122a4..46e18493d4 100644
--- a/pod/perlrun.pod
+++ b/pod/perlrun.pod
@@ -266,11 +266,21 @@ An alternate delimiter may be specified using B<-F>.
 
 =item B<-C>
 
-enables Perl to use the native wide character APIs on the target system.
-The magic variable C<${^WIDE_SYSTEM_CALLS}> reflects the state of
-this switch. See L.
-
-This feature is currently only implemented on the Win32 platform.
+enables Perl to use the Unicode APIs on the target system.
+
+As of Perl 5.8.1, if C<-C> is used and the locale settings (the LC_ALL,
+LC_CTYPE, and LANG environment variables) indicate a UTF-8 locale,
+the STDIN is expected to be in UTF-8, the STDOUT and STDERR are
+expected to be in UTF-8, and C<:utf8> is the default file open layer.
+See L, L, and L for more information.
+The magic variable C<${^UTF8_LOCALE}> reflects this state,
+see L. (Another way of setting this
+variable is to set the environment variable PERL_UTF8_LOCALE.)
+
+(In Perls earlier than 5.8.1 the C<-C> switch was a Win32-only switch
+that enabled the use of Unicode-aware "wide system call" Win32 APIs.
+This feature was practically unused, however, and the command line
+switch was therefore "recycled".)
 
 =item B<-c>
 
diff --git a/pod/perlunicode.pod b/pod/perlunicode.pod
index ee8b6efe7e..1d3f84626f 100644
--- a/pod/perlunicode.pod
+++ b/pod/perlunicode.pod
@@ -67,13 +67,6 @@ character data. Such data may come from filehandles, from calls to
 external programs, from information provided by the system (such as %ENV),
 or from literals and constants in the source text.
 
-On Windows platforms, if the C<-C> command line switch is used or the
-${^WIDE_SYSTEM_CALLS} global flag is set to C<1>, all system calls
-will use the corresponding wide-character APIs. This feature is
-available only on Windows to conform to the API standard already
-established for that platform--and there are very few non-Windows
-platforms that have Unicode-aware APIs.
-
 The C pragma will always, regardless of platform, force byte
 semantics in a particular lexical scope. See L.
 
@@ -1050,10 +1043,14 @@ there are a couple of exceptions:
 
 =item *
 
-If your locale environment variables (LANGUAGE, LC_ALL, LC_CTYPE, LANG)
-contain the strings 'UTF-8' or 'UTF8' (case-insensitive matching),
-the default encodings of your STDIN, STDOUT, and STDERR, and of
-B, are considered to be UTF-8.
+If your locale environment variables (LC_ALL, LC_CTYPE, LANG)
+contain the strings 'UTF-8' or 'UTF8' (matched case-insensitively)
+B you enable using UTF-8 either by using the C<-C> command line
+switch or setting the PERL_UTF8_LOCALE environment variable to a true
+value, then the default encodings of your STDIN, STDOUT, and STDERR,
+and of B, are considered to be UTF-8.
+See L, L, and L for more
+information. The magic variable C<${^UTF8_LOCALE}> will also be set.
 
 =item *
 
@@ -1410,6 +1407,6 @@ the UTF-8 flag:
 =head1 SEE ALSO
 
 L, L, L, L, L, L,
-L, L
+L, L
 
 =cut
diff --git a/pod/perluniintro.pod b/pod/perluniintro.pod
index 21f0fa7600..3a2346004c 100644
--- a/pod/perluniintro.pod
+++ b/pod/perluniintro.pod
@@ -172,13 +172,15 @@ To output UTF-8, use the C<:utf8> output layer.
 Prepending to this sample program ensures that the output is completely UTF-8,
 and removes the program's warning.
 
-If your locale environment variables (C, C,
-C, C) contain the strings 'UTF-8' or 'UTF8',
-regardless of case, then the default encoding of your STDIN, STDOUT,
-and STDERR and of B, is UTF-8. Note that
-this means that Perl expects other software to work, too: if Perl has
-been led to believe that STDIN should be UTF-8, but then STDIN coming
-in from another command is not UTF-8, Perl will complain about the
+If your locale environment variables (C, C, C)
+contain the strings 'UTF-8' or 'UTF8' (matched case-insensitively)
+B you enable using UTF-8 either by using the C<-C> command line
+switch or by setting the PERL_UTF8_LOCALE environment variable to
+a true value, then the default encoding of your STDIN, STDOUT, and
+STDERR, and of B, is UTF-8. Note that this
+means that Perl expects other software to work, too: if Perl has been
+led to believe that STDIN should be UTF-8, but then STDIN coming in
+from another command is not UTF-8, Perl will complain about the
 malformed UTF-8.
 
 All features that combine Unicode and I/O also require using the new
diff --git a/pod/perlvar.pod b/pod/perlvar.pod
index 08235c2cb4..7621be0c0d 100644
--- a/pod/perlvar.pod
+++ b/pod/perlvar.pod
@@ -1109,6 +1109,16 @@ Reflects if taint mode is on or off. 1 for on (the program was run with
 B<-T>), 0 for off, -1 when only taint warnings are enabled (i.e. with
 B<-t> or B<-TU>). This variable is read-only.
 
+=item ${^UTF8_LOCALE}
+
+Reflects whether the locale settings indicated the use of UTF-8 and that
+the use of UTF-8 was enabled either by the C<-C> command line switch or
+by setting the PERL_UTF8_LOCALE environment variable to a true value.
+This variable is read-only. If true, the STDIN is expected to be in
+UTF-8, the STDOUT and STDERR are in UTF-8, and C<:utf8> is the default
+file open layer. See L, L, and L
+for more information.
+
 =item $PERL_VERSION
 
 =item $^V
@@ -1148,21 +1158,6 @@ related to the B<-w> switch.) See also L.
 The current set of warning checks enabled by the C pragma.
 See the documentation of C for more details.
 
-=item ${^WIDE_SYSTEM_CALLS}
-
-Global flag that enables system calls made by Perl to use wide character
-APIs native to the system, if available. This is currently only implemented
-on the Windows platform.
-
-This can also be enabled from the command line using the C<-C> switch.
-
-The initial value is typically C<0> for compatibility with Perl versions
-earlier than 5.6, but may be automatically set to C<1> by Perl if the system
-provides a user-settable default (e.g., C<$ENV{LC_CTYPE}>).
-
-The C pragma always overrides the effect of this flag in the current
-lexical scope. See L.
-
 =item $EXECUTABLE_NAME
 
 =item $^X
-- 
cgit v1.2.1