author Karl Williamson <public@khwilliamson.com> 2010-11-15 13:51:24 -0700
committer Father Chrysostomos <sprout@cpan.org> 2010-11-22 13:32:57 -0800
commit 2283d3262a3ac5434ac8c6682ac0f76d9af24090
tree cb67925378cb299033dbdb4a0c4b3ae67e8715cc /pod/perlunicode.pod
parent 7c7ce6a76faad86329b21010f476ccd95dfe9af4
[bracketed char class] fixes
This patch adds two functions for setting the ANYOF node bitmaps. The one for dealing with folds has intelligence as to what to do if unicode semantics is in effect. Together with previous commits, this fixes the unicode bug for bracketed character classes, as far as known bugs go, so pods are updated as well.
Diffstat (limited to 'pod/perlunicode.pod')
-rw-r--r-- pod/perlunicode.pod 29
1 files changed, 25 insertions, 4 deletions
diff --git a/pod/perlunicode.pod b/pod/perlunicode.pod
index 978bede654..b950f7bf73 100644
--- a/pod/perlunicode.pod
+++ b/pod/perlunicode.pod
@@ -1515,10 +1515,31 @@ support seamlessly. The result wasn't seamless: these characters were
orphaned.
Work is being done to correct this, but only some of it is complete.
-What has been finished is the matching of C<\b>, C<\s>, C<\w> and the Posix
-character classes and their complements in regular expressions, and the
-important part of the case
-changing component. Due to concerns, and some evidence, that older code might
+What has been finished is:
+
+=over
+
+=item *
+
+the matching of C<\b>, C<\s>, C<\w> and the Posix
+character classes and their complements in regular expressions
+
+=item *
+
+case changing (but not user-defined casing)
+
+=item *
+
+case-insensitive (C</i>) regular expression matching for [bracketed
+character classes] only, except for some bugs with C<LATIN SMALL
+LETTER SHARP S> (which is supposed to match the two character sequence
+"ss" (or "Ss" or "sS" or "SS"), but Perl has a number of bugs for all
+such multi-character case insensitive characters, of which this is just
+one example.
+
+=back
+
+Due to concerns, and some evidence, that older code might
have come to rely on the existing behavior, the new behavior must be explicitly
enabled by the feature C<unicode_strings> in the L<feature> pragma, even though
no new syntax is involved.
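
Illustration (not part of the patch): a minimal sketch of how the three finished items in the hunk above behave with and without the C<unicode_strings> feature. It assumes a Perl recent enough to support the feature and an ASCII platform, where "\xE9" is LATIN SMALL LETTER E WITH ACUTE and "\xC9" is its uppercase form. The hash names C<%without> and C<%with> are made up for this example.

```perl
use strict;
use warnings;

# Both strings are plain 8-bit (non-UTF-8) strings, which is the case
# the documentation change is about.
my $lower = "\xE9";    # LATIN SMALL LETTER E WITH ACUTE
my $upper = "\xC9";    # LATIN CAPITAL LETTER E WITH ACUTE

my (%without, %with);

{
    # Legacy behavior: characters 128..255 in a non-UTF-8 string
    # get ASCII-only semantics.
    no feature 'unicode_strings';
    %without = (
        word  => ($lower =~ /\w/       ? 1 : 0),   # \w matching
        uc    => (uc($lower) eq $upper ? 1 : 0),   # case changing
        class => ($upper =~ /[\xE9]/i  ? 1 : 0),   # /i in a bracketed class
    );
}

{
    # New behavior, enabled explicitly and lexically scoped.
    use feature 'unicode_strings';
    %with = (
        word  => ($lower =~ /\w/       ? 1 : 0),
        uc    => (uc($lower) eq $upper ? 1 : 0),
        class => ($upper =~ /[\xE9]/i  ? 1 : 0),
    );
}

for my $test (qw(word uc class)) {
    printf "%-5s  without=%d  with=%d\n",
        $test, $without{$test}, $with{$test};
}
```

Because the feature is lexically scoped, older code that relies on the previous behavior is unaffected unless it opts in.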