author | Jeffrey Friedl <jfriedl@regex.info> | 2000-07-16 10:55:29 -0700 |
---|---|---|
committer | Jarkko Hietaniemi <jhi@iki.fi> | 2000-08-18 21:07:06 +0000 |
commit | aaa51d5e11b8b0db616a7f939c784733b4cfef87 (patch) | |
tree | 5cc57e412e80d1006256d7edc7526dd927cbe2bd /pod/perlretut.pod | |
parent | 2a4ebaa641b7ba24b2dcfc940bb2b5da27d05b4e (diff) | |
Add [[:blank:]] as suggested in
Subject: [ID 20000716.024] [=cc=] / [:blank:]
Message-Id: <200007170055.RAA23528@fummy.dsl.yahoo.com>
(the [=cc=] has already been taken care of by #6439
so the whole bug report can be closed)
and make [[:space:]] equivalent to isspace(3)
(as opposed to \s, which is isSPACE()). The difference
is that now [[:space:]] matches the mythical vertical tab,
while \s doesn't.
p4raw-id: //depot/perl@6703
Diffstat (limited to 'pod/perlretut.pod')
-rw-r--r-- | pod/perlretut.pod | 18 |
1 files changed, 10 insertions, 8 deletions
diff --git a/pod/perlretut.pod b/pod/perlretut.pod
index 66f8179ab6..87669e50ab 100644
--- a/pod/perlretut.pod
+++ b/pod/perlretut.pod
@@ -1672,15 +1672,17 @@ i.e., a non-mark followed by one or more marks.
 
 As if all those classes weren't enough, Perl also defines POSIX style
 character classes. These have the form C<[:name:]>, with C<name> the
-name of the POSIX class. The POSIX classes are alpha, alnum, ascii,
-cntrl, digit, graph, lower, print, punct, space, upper, word, and
-xdigit. If C<utf8> is being used, then these classes are defined the
-same as their corresponding perl Unicode classes: C<[:upper:]> is the
-same as C<\p{IsUpper}>, etc. The POSIX character classes, however,
-don't require using C<utf8>. The C<[:digit:]>, C<[:word:]>, and
+name of the POSIX class. The POSIX classes are C<alpha>, C<alnum>,
+C<ascii>, C<cntrl>, C<digit>, C<graph>, C<lower>, C<print>, C<punct>,
+C<space>, C<upper>, and C<xdigit>, and two extensions, C<word> (a Perl
+extension to match C<\w>), and C<blank> (a GNU extension). If C<utf8>
+is being used, then these classes are defined the same as their
+corresponding perl Unicode classes: C<[:upper:]> is the same as
+C<\p{IsUpper}>, etc. The POSIX character classes, however, don't
+require using C<utf8>. The C<[:digit:]>, C<[:word:]>, and
 C<[:space:]> correspond to the familiar C<\d>, C<\w>, and C<\s>
-character classes. To negate a POSIX class, put a C<^> in front of the
-name, so that, e.g., C<[:^digit:]> corresponds to C<\D> and under
+character classes. To negate a POSIX class, put a C<^> in front of
+the name, so that, e.g., C<[:^digit:]> corresponds to C<\D> and under
 C<utf8>, C<\P{IsDigit}>. The Unicode and POSIX character classes can
 be used just like C<\d>, both inside and outside of character classes:
 
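The classes touched by this change can be exercised with a short script. What follows is only a minimal sketch, not part of the commit: the helper names `is_blank`, `is_space`, and `is_nondigit` and the sample characters are hypothetical, chosen for illustration. It checks single characters against C<[:blank:]>, C<[:space:]>, and the negated C<[:^digit:]>. Note that the POSIX class has to appear inside a bracketed character class, hence the doubled brackets.

```perl
use strict;
use warnings;

# Hypothetical helpers: each tests whether a one-character string
# belongs to the named POSIX class. \A and \z anchor the whole string,
# so a trailing newline is never silently skipped.
sub is_blank    { return $_[0] =~ /\A[[:blank:]]\z/  ? 1 : 0 }
sub is_space    { return $_[0] =~ /\A[[:space:]]\z/  ? 1 : 0 }
sub is_nondigit { return $_[0] =~ /\A[[:^digit:]]\z/ ? 1 : 0 }

# Illustrative sample characters (assumed, not taken from the commit).
my @samples = (
    [ 'space',   " "    ],
    [ 'tab',     "\t"   ],
    [ 'newline', "\n"   ],
    [ 'vtab',    "\x0B" ],   # the vertical tab from the commit message
    [ 'letter',  "a"    ],
    [ 'digit',   "7"    ],
);

for my $sample (@samples) {
    my ($name, $ch) = @$sample;
    printf "%-8s blank=%d space=%d nondigit=%d\n",
        $name, is_blank($ch), is_space($ch), is_nondigit($ch);
}
```

The commit message's remark that C<\s> does not match the vertical tab describes the Perl of that time; later releases (5.18 onward) changed C<\s> to match it as well, so the sketch deliberately tests only the POSIX classes.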