author Jarkko Hietaniemi <jhi@iki.fi> 2001-11-17 22:22:47 +0000
committer Jarkko Hietaniemi <jhi@iki.fi> 2001-11-17 22:22:47 +0000
commit 72ff290864ea88cc224b5d3af7058f500755f94a
tree f234643fe093a72f93714de2121657434ab42612 /pod/perlre.pod
parent c8795d8b7ccb16a95758a094cc4a0572927cb4cc
Banish "use utf8".
p4raw-id: //depot/perl@13064
Diffstat (limited to 'pod/perlre.pod')
-rw-r--r-- pod/perlre.pod 15
1 files changed, 9 insertions, 6 deletions
diff --git a/pod/perlre.pod b/pod/perlre.pod
index 6c687495cb..5c7e76b5ad 100644
--- a/pod/perlre.pod
+++ b/pod/perlre.pod
@@ -184,7 +184,9 @@ In addition, Perl defines the following:
\PP Match non-P
\X Match eXtended Unicode "combining character sequence",
equivalent to C<(?:\PM\pM*)>
- \C Match a single C char (octet) even under utf8.
+ \C Match a single C char (octet) even under Unicode.
+ B<NOTE:> breaks up characters into their UTF-8 bytes,
+ so you may end up with malformed pieces of UTF-8.
A C<\w> matches a single alphanumeric character or C<_>, not a whole word.
Use C<\w+> to match a string of Perl-identifier characters (which isn't
@@ -193,7 +195,7 @@ list of alphabetic characters generated by C<\w> is taken from the
current locale. See L<perllocale>. You may use C<\w>, C<\W>, C<\s>, C<\S>,
C<\d>, and C<\D> within character classes, but if you try to use them
as endpoints of a range, that's not a range, the "-" is understood literally.
-See L<utf8> for details about C<\pP>, C<\PP>, and C<\X>.
+See L<perlunicode> for details about C<\pP>, C<\PP>, and C<\X>.
The POSIX character class syntax
@@ -230,9 +232,10 @@ whole character class. For example:
matches zero, one, any alphabetic character, and the percentage sign.
-If the C<utf8> pragma is used, the following equivalences to Unicode
-\p{} constructs and equivalent backslash character classes (if available),
-will hold:
+The following equivalences to Unicode \p{} constructs and equivalent
+backslash character classes (if available), will hold:
+
+ [:...:] \p{...} backslash
alpha IsAlpha
alnum IsAlnum
@@ -291,7 +294,7 @@ work just fine) it is included for completeness.
You can negate the [::] character classes by prefixing the class name
with a '^'. This is a Perl extension. For example:
- POSIX trad. Perl utf8 Perl
+ POSIX traditional Unicode
[:^digit:] \D \P{IsDigit}
[:^space:] \S \P{IsSpace}