author | Karl Williamson <khw@cpan.org> | 2014-12-29 12:57:02 -0700 |
---|---|---|
committer | Karl Williamson <khw@cpan.org> | 2014-12-29 13:52:57 -0700 |
commit | dbf3c4d788344c8d20eb2549c638ced519d3f0e8 (patch) | |
tree | cd59ab76da0616cecc9793ebce9f0be3d0cd92a3 /pod/perllocale.pod | |
parent | d635b7101aac73db76a54016b58991ba7cd8d778 (diff) | |
perllocale: Nits
Diffstat (limited to 'pod/perllocale.pod')
-rw-r--r-- | pod/perllocale.pod | 25 |
1 files changed, 14 insertions, 11 deletions
diff --git a/pod/perllocale.pod b/pod/perllocale.pod
index d083c09d2f..17fddcb40d 100644
--- a/pod/perllocale.pod
+++ b/pod/perllocale.pod
@@ -298,8 +298,8 @@ C<ucfirst()>, and C<lcfirst()>) use C<LC_CTYPE>
 
 =item *
 
-The variables L<$!|perlvar/$ERRNO> (and its synonyms C<$ERRNO> and
-C<$OS_ERROR>) and L<$^E|perlvar/$EXTENDED_OS_ERROR> (and its synonym
+B<The variables L<$!|perlvar/$ERRNO>> (and its synonyms C<$ERRNO> and
+C<$OS_ERROR>) B<and L<$^E|perlvar/$EXTENDED_OS_ERROR>> (and its synonym
 C<$EXTENDED_OS_ERROR>) when used as strings use C<LC_MESSAGES>.
 
 =back
@@ -755,7 +755,7 @@ alphabets, but where do "E<aacute>" and "E<aring>" belong? And while
 "color" follows "chocolate" in English, what about in traditional Spanish?
 
 The following collations all make sense and you may meet any of them
-if you "use locale".
+if you C<"use locale">.
 
     A B C D E a b c d e
     A a B b C c D d E e
@@ -792,7 +792,7 @@ C<$equal_in_locale> will be true if the collation locale specifies a
 dictionary-like ordering that ignores space characters completely and
 which folds case.
 
-Perl only supports single-byte locales for C<LC_COLLATE>. This means
+Perl currently only supports single-byte locales for C<LC_COLLATE>. This means
 that a UTF-8 locale likely will just give you machine-native ordering.
 Use L<Unicode::Collate> for the full implementation of the Unicode
 Collation Algorithm.
@@ -1005,7 +1005,7 @@ results. Here are a few possibilities:
 
 Regular expression checks for safe file names or mail addresses using
 C<\w> may be spoofed by an C<LC_CTYPE> locale that claims that
-characters such as "E<gt>" and "|" are alphanumeric.
+characters such as C<"E<gt>"> and C<"|"> are alphanumeric.
 
 =item *
 
@@ -1466,9 +1466,12 @@ the characters in the upper half of the Latin-1 range (128 - 255)
 properly under C<LC_CTYPE>. To see if a character is a particular type
 under a locale, Perl uses the functions like C<isalnum()>.
 Your C library may not work for UTF-8 locales with those functions, instead
-only working under the newer wide library functions like C<iswalnum()>.
-However, they are treated like single-byte locales, and will have the
-restrictions described below.
+only working under the newer wide library functions like C<iswalnum()>,
+which Perl does not use.
+These multi-byte locales are treated like single-byte locales, and will
+have the restrictions described below. Starting in Perl v5.22 a warning
+message is raised when Perl detects a multi-byte locale that it doesn't
+fully support.
 
 For single-byte locales,
 Perl generally takes the tack to use locale rules on code points that can fit
@@ -1488,7 +1491,7 @@ Unicode, C<\p{Alpha}> will never match it, regardless of locale. A similar
 issue occurs with C<\N{...}>. Prior to v5.20, It is therefore a bad
 idea to use C<\p{}> or C<\N{}> under plain C<use locale>--I<unless> you
 can guarantee that the
-locale will be a ISO8859-1. Use POSIX character classes instead.
+locale will be ISO8859-1. Use POSIX character classes instead.
 
 Another problem with this approach is that operations that cross the
 single byte/multiple byte boundary are not well-defined, and so are
@@ -1541,8 +1544,8 @@ Pre-v5.12, it was somewhat haphazard; in v5.12 it was applied fairly
 consistently to regular expression matching except for bracketed
 character classes; in v5.14 it was extended to all regex matches; and in
 v5.16 to the casing operations such as C<\L> and C<uc()>. For
-collation, in all releases, the system's C<strxfrm()> function is called,
-and whatever it does is what you get.
+collation, in all releases so far, the system's C<strxfrm()> function is
+called, and whatever it does is what you get.
 
 =head1 BUGS
 