author | Karl Williamson <khw@cpan.org> | 2015-02-17 15:03:32 -0700 |
---|---|---|
committer | Karl Williamson <khw@cpan.org> | 2015-02-19 22:55:01 -0700 |
commit | 64935bc6975bb01af403817752e88d6540c8711d (patch) | |
tree | 6a1619ac46d1501a5b296a2b69d9af0b98db8c58 /pod/perlunicode.pod | |
parent | 0e0b935601a8b7a2c56653412a94a36f986bc34f (diff) | |
Add qr/\b{gcb}/
A function determines whether the space between any two characters is
a grapheme cluster break. After I wrote this, I realized that an array
lookup might be a better implementation, but the deadline for v5.22 was
too close to change it. I did see that my gcc optimized it down to
an array lookup.
This makes the implementation of \X go from being complicated to
trivial.
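
The trade-off mentioned above, a decision function versus an array lookup, can be illustrated with a minimal sketch. This is not the code from this commit: all names (`gcb_class`, `gcb_is_break_fn`, `gcb_break_table`, and so on) are hypothetical, and the class set is deliberately reduced to a few UAX#29 Grapheme_Cluster_Break values, leaving out the Hangul syllable and Regional_Indicator rules. Both forms take the class of the character before and after a position and report whether a break is allowed there.

```c
#include <stddef.h>

/* Hypothetical, reduced set of Grapheme_Cluster_Break classes.
 * Hangul (L, V, T, LV, LVT) and Regional_Indicator are omitted
 * to keep the sketch short. */
enum gcb_class {
    GCB_OTHER = 0,
    GCB_CR,
    GCB_LF,
    GCB_CONTROL,
    GCB_EXTEND,
    GCB_SPACING_MARK,
    GCB_PREPEND,
    GCB_COUNT
};

/* Function form: apply the pairwise rules in priority order.
 * Returns 1 if a break is allowed between 'before' and 'after'. */
static int gcb_is_break_fn(enum gcb_class before, enum gcb_class after)
{
    /* GB3: never break inside CR LF */
    if (before == GCB_CR && after == GCB_LF)
        return 0;
    /* GB4: always break after a control, CR, or LF */
    if (before == GCB_CR || before == GCB_LF || before == GCB_CONTROL)
        return 1;
    /* GB5: always break before a control, CR, or LF */
    if (after == GCB_CR || after == GCB_LF || after == GCB_CONTROL)
        return 1;
    /* GB9, GB9a: never break before Extend or SpacingMark */
    if (after == GCB_EXTEND || after == GCB_SPACING_MARK)
        return 0;
    /* GB9b: never break after Prepend */
    if (before == GCB_PREPEND)
        return 0;
    /* GB10: otherwise break everywhere */
    return 1;
}

/* Array form: the same decisions precomputed.
 * Rows are 'before', columns are 'after', in enum order:
 * OTHER, CR, LF, CONTROL, EXTEND, SPACING_MARK, PREPEND */
static const unsigned char gcb_break_table[GCB_COUNT][GCB_COUNT] = {
    /* OTHER        */ {1, 1, 1, 1, 0, 0, 1},
    /* CR           */ {1, 1, 0, 1, 1, 1, 1},
    /* LF           */ {1, 1, 1, 1, 1, 1, 1},
    /* CONTROL      */ {1, 1, 1, 1, 1, 1, 1},
    /* EXTEND       */ {1, 1, 1, 1, 0, 0, 1},
    /* SPACING_MARK */ {1, 1, 1, 1, 0, 0, 1},
    /* PREPEND      */ {0, 1, 1, 1, 0, 0, 0},
};

static int gcb_is_break_table(enum gcb_class before, enum gcb_class after)
{
    return gcb_break_table[before][after];
}

/* Count the class pairs on which the two forms disagree. */
static size_t gcb_count_mismatches(void)
{
    size_t mismatches = 0;
    for (int b = 0; b < GCB_COUNT; b++)
        for (int a = 0; a < GCB_COUNT; a++)
            if (gcb_is_break_fn((enum gcb_class)b, (enum gcb_class)a)
                != gcb_is_break_table((enum gcb_class)b, (enum gcb_class)a))
                mismatches++;
    return mismatches;
}
```

A caller would first map each code point to its class (not shown here) and then call either form for each adjacent pair. Because the function depends only on two small enum values, a compiler can plausibly reduce it to a table, which is consistent with the gcc observation in the message.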
Diffstat (limited to 'pod/perlunicode.pod')
-rw-r--r-- | pod/perlunicode.pod | 8 |
1 files changed, 5 insertions, 3 deletions
diff --git a/pod/perlunicode.pod b/pod/perlunicode.pod
index 0482d92596..ee99198e2d 100644
--- a/pod/perlunicode.pod
+++ b/pod/perlunicode.pod
@@ -1100,7 +1100,8 @@ Level 2 - Extended Unicode Support

  [10] see UAX#15 "Unicode Normalization Forms"
  [11] have Unicode::Normalize but not integrated to regexes
- [12] have \X but we don't have a "Grapheme Cluster Mode"
+ [12] have \X and \b{gcb} but we don't have a "Grapheme Cluster
+ Mode"
  [14] see UAX#29, Word Boundaries
  [15] This is covered in Chapter 3.13 (in Unicode 6.0)

@@ -1575,8 +1576,9 @@ regular expressions outside the scope.

 =item *

-Matching any of several properties in regular expressions, namely C<\b>,
-C<\B>, C<\s>, C<\S>, C<\w>, C<\W>, and all the Posix character classes
+Matching any of several properties in regular expressions, namely
+C<\b> (without braces), C<\B> (without braces), C<\s>, C<\S>, C<\w>,
+C<\W>, and all the Posix character classes
 I<except> C<[[:ascii:]]>.
 Starting in Perl 5.14.0, regular expressions compiled within
 the scope of C<unicode_strings> use character semantics