author | Jarkko Hietaniemi <jhi@iki.fi> | 2001-11-17 22:22:47 +0000 |
---|---|---|
committer | Jarkko Hietaniemi <jhi@iki.fi> | 2001-11-17 22:22:47 +0000 |
commit | 72ff290864ea88cc224b5d3af7058f500755f94a (patch) | |
tree | f234643fe093a72f93714de2121657434ab42612 /pod/perlre.pod | |
parent | c8795d8b7ccb16a95758a094cc4a0572927cb4cc (diff) | |
download | perl-72ff290864ea88cc224b5d3af7058f500755f94a.tar.gz |
Banish "use utf8".
p4raw-id: //depot/perl@13064
Diffstat (limited to 'pod/perlre.pod')
-rw-r--r-- | pod/perlre.pod | 15 |
1 files changed, 9 insertions, 6 deletions
diff --git a/pod/perlre.pod b/pod/perlre.pod
index 6c687495cb..5c7e76b5ad 100644
--- a/pod/perlre.pod
+++ b/pod/perlre.pod
@@ -184,7 +184,9 @@ In addition, Perl defines the following:
     \PP  Match non-P
     \X   Match eXtended Unicode "combining character sequence",
          equivalent to C<(?:\PM\pM*)>
-    \C   Match a single C char (octet) even under utf8.
+    \C   Match a single C char (octet) even under Unicode.
+         B<NOTE:> breaks up characters into their UTF-8 bytes,
+         so you may end up with malformed pieces of UTF-8.
 
 A C<\w> matches a single alphanumeric character or C<_>, not a whole word.
 Use C<\w+> to match a string of Perl-identifier characters (which isn't
@@ -193,7 +195,7 @@ list of alphabetic characters generated by C<\w> is taken from the
 current locale. See L<perllocale>. You may use C<\w>, C<\W>, C<\s>, C<\S>,
 C<\d>, and C<\D> within character classes, but if you try to use them
 as endpoints of a range, that's not a range, the "-" is understood literally.
-See L<utf8> for details about C<\pP>, C<\PP>, and C<\X>.
+See L<perlunicode> for details about C<\pP>, C<\PP>, and C<\X>.
 
 The POSIX character class syntax
 
@@ -230,9 +232,10 @@ whole character class. For example:
 
 matches zero, one, any alphabetic character, and the percentage sign.
 
-If the C<utf8> pragma is used, the following equivalences to Unicode
-\p{} constructs and equivalent backslash character classes (if available),
-will hold:
+The following equivalences to Unicode \p{} constructs and equivalent
+backslash character classes (if available), will hold:
+
+    [:...:]     \p{...}     backslash
 
     alpha       IsAlpha
     alnum       IsAlnum
@@ -291,7 +294,7 @@ work just fine) it is included for completeness.
 You can negate the [::] character classes by prefixing the class name
 with a '^'. This is a Perl extension. For example:
 
-    POSIX       trad. Perl  utf8 Perl
+    POSIX       traditional Unicode
 
     [:^digit:]  \D          \P{IsDigit}
     [:^space:]  \S          \P{IsSpace}
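
The equivalences documented in the patched text can be checked with a short script. This is a minimal sketch, not part of the commit: the helper name `classify` and the sample characters are assumptions made for illustration, and it assumes a reasonably modern Perl in which `\p{IsDigit}` is accepted without any `use utf8` pragma. It deliberately avoids `\C`, which the patch itself warns can split characters into malformed UTF-8 pieces.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the commit): reports which of the three
# documented notations match a single-character string, for the digit
# class and for its negation.
sub classify {
    my ($ch) = @_;
    return {
        posix_digit        => ($ch =~ /\A[[:digit:]]\z/  ? 1 : 0),
        backslash_digit    => ($ch =~ /\A\d\z/           ? 1 : 0),
        unicode_digit      => ($ch =~ /\A\p{IsDigit}\z/  ? 1 : 0),
        posix_nondigit     => ($ch =~ /\A[[:^digit:]]\z/ ? 1 : 0),
        backslash_nondigit => ($ch =~ /\A\D\z/           ? 1 : 0),
        unicode_nondigit   => ($ch =~ /\A\P{IsDigit}\z/  ? 1 : 0),
    };
}

# Sample characters chosen only for illustration.
for my $ch ("7", "x", " ") {
    my $r = classify($ch);
    printf "%-3s digit: %d %d %d   non-digit: %d %d %d\n",
        "'$ch'",
        @{$r}{qw(posix_digit backslash_digit unicode_digit)},
        @{$r}{qw(posix_nondigit backslash_nondigit unicode_nondigit)};
}
```

For each sample, the three digit columns should agree with one another, and the three non-digit columns should be their complement, which is what the table of equivalences in the diff describes.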