author | Jarkko Hietaniemi <jhi@iki.fi> | 2002-01-04 04:04:05 +0000 |
---|---|---|
committer | Jarkko Hietaniemi <jhi@iki.fi> | 2002-01-04 04:04:05 +0000 |
commit | 08ce8fc6c711b96a43427a2fe173f1d81abd18c2 (patch) | |
tree | 1a5d38aeb448c7d00d0a71ed70d25ccc56a75e46 | |
parent | ac206dc8afc56c3af5eccaed5ba83dda4e594b6e (diff) | |
download | perl-08ce8fc6c711b96a43427a2fe173f1d81abd18c2.tar.gz |
Document the U+0085, U+2028, and U+2029.
p4raw-id: //depot/perl@14054
-rw-r--r-- | pod/perlre.pod | 22
-rw-r--r-- | pod/perlretut.pod | 2 |
2 files changed, 14 insertions, 10 deletions
diff --git a/pod/perlre.pod b/pod/perlre.pod
index feafb0e654..58cd6456f5 100644
--- a/pod/perlre.pod
+++ b/pod/perlre.pod
@@ -188,14 +188,18 @@ In addition, Perl defines the following:
 NOTE: breaks up characters into their UTF-8 bytes,
 so you may end up with malformed pieces of UTF-8.
 
-A C<\w> matches a single alphanumeric character or C<_>, not a whole word.
-Use C<\w+> to match a string of Perl-identifier characters (which isn't
-the same as matching an English word). If C<use locale> is in effect, the
-list of alphabetic characters generated by C<\w> is taken from the
-current locale. See L<perllocale>. You may use C<\w>, C<\W>, C<\s>, C<\S>,
+A C<\w> matches a single alphanumeric character (an alphabetic
+character, or a decimal digit) or C<_>, not a whole word. Use C<\w+>
+to match a string of Perl-identifier characters (which isn't the same
+as matching an English word). If C<use locale> is in effect, the list
+of alphabetic characters generated by C<\w> is taken from the current
+locale. See L<perllocale>. You may use C<\w>, C<\W>, C<\s>, C<\S>,
 C<\d>, and C<\D> within character classes, but if you try to use them
-as endpoints of a range, that's not a range, the "-" is understood literally.
-See L<perlunicode> for details about C<\pP>, C<\PP>, and C<\X>.
+as endpoints of a range, that's not a range, the "-" is understood
+literally. If Unicode is in effect, C<\s> matches also "\x{85}",
+"\x{2028}, and "\x{2029}", see L<perlunicode> for more details about
+C<\pP>, C<\PP>, and C<\X>, and L<perluniintro> about Unicode in
+general.
 
 The POSIX character class syntax
 
@@ -228,11 +232,11 @@ A GNU extension equivalent to C<[ \t]>, `all horizontal whitespace'.
 =item [2]
 
 Not exactly equivalent to C<\s> since the C<[[:space:]]> includes
-also the (very rare) `vertical tabulator', \ck", chr(11).
+also the (very rare) `vertical tabulator', "\ck", chr(11).
 
 =item [3]
 
-A Perl extension.
+A Perl extension, see above.
 
 =back
 
diff --git a/pod/perlretut.pod b/pod/perlretut.pod
index 8f7c8cdd72..e90e03d602 100644
--- a/pod/perlretut.pod
+++ b/pod/perlretut.pod
@@ -1738,7 +1738,7 @@ traditional Unicode classes:
 IsPrint /^([LMNPS]|Co|Zs)/
 IsPunct /^P/
 IsSpace /^Z/ || ($code =~ /^(0009|000A|000B|000C|000D)$/
-IsSpacePerl /^Z/ || ($code =~ /^(0009|000A|000C|000D)$/
+IsSpacePerl /^Z/ || ($code =~ /^(0009|000A|000C|000D|0085|2028|2029)$/
 IsUpper /^L[ut]/
 IsWord /^[LMN]/ || $code eq "005F"
 IsXDigit $code =~ /^00(3[0-9]|[46][1-6])$/
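
The behaviour documented by this patch can be checked with a short script. The following is a minimal sketch and not part of the commit: the helper name `is_perl_space` is hypothetical, and the `/u` modifier is an assumption about the reader's Perl (it requires 5.14 or later, well after this change) used here to force Unicode rules so that the result for U+0085 does not depend on how the string happens to be stored internally.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the single character with code point $cp
# is matched by \s under Unicode rules (/u needs Perl 5.14 or later).
sub is_perl_space {
    my ($cp) = @_;
    return chr($cp) =~ /\A\s\z/u ? 1 : 0;
}

# The three code points named in the commit message.
my @documented = (
    [ 0x0085, 'NEXT LINE' ],
    [ 0x2028, 'LINE SEPARATOR' ],
    [ 0x2029, 'PARAGRAPH SEPARATOR' ],
);

for my $entry (@documented) {
    my ($cp, $name) = @$entry;
    printf "U+%04X %-20s matched by \\s: %s\n",
        $cp, $name, is_perl_space($cp) ? 'yes' : 'no';
}
```

Without `/u` (or an equivalent such as `use feature 'unicode_strings'`), whether `"\x{85}"` matches `\s` can depend on whether the string is stored as UTF-8 internally, which is why the sketch states the modifier explicitly.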