author Karl Williamson <public@khwilliamson.com> 2013-12-23 20:35:54 -0700
committer Karl Williamson <public@khwilliamson.com> 2013-12-31 08:27:23 -0700
commit 2d88a86a5910c97496b47b7b7c223f2c9a14b57c
tree c0125ea6a9b6175c93245c4048773ae82e0f4efc /pod/perlrecharclass.pod
parent f215ab38f4d9ea2dca08fc71b38db0eb650d5107
Change \p{} matching for above-Unicode code points
http://markmail.org/message/eod7ukhbbh5tnll4 is the beginning of the thread that led to this commit. This commit revises the handling of \p{} and \P{} to treat above-Unicode code points as typical unassigned Unicode ones, and to output a warning during matching only when the answer is arguable under strict Unicode rules (that is, "matched" for \p{} and "didn't match" for \P{}). The exception is when the warning category has been made fatal, in which case it tries hard to always output the warning. The definition of \p{All} is changed to be qr/./s, and no warning is issued at all when matching it against above-Unicode code points.
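The behaviour described in the commit message can be illustrated with a short script. This is only a sketch, not part of the commit: it assumes a Perl of v5.20 or later, the variable names are invented for illustration, and it turns off the `non_unicode` warning category so that only the match results are shown.

```perl
use v5.20;                   # the revised \p{} handling starts in v5.20
use warnings;
no warnings 'non_unicode';   # assumption: we only want the match results

# A code point just above the Unicode maximum of 0x10FFFF.
my $above = chr(0x110000);

# Treated like a typical unassigned code point, so it is not a hex digit:
# the =True form should fail and the =False form should succeed.
my $hex_true  = $above =~ /\p{ASCII_Hex_Digit=True}/  ? 1 : 0;
my $hex_false = $above =~ /\p{ASCII_Hex_Digit=False}/ ? 1 : 0;

# \P{} is the complement of \p{}, so it should succeed here as well.
my $not_hex = $above =~ /\P{ASCII_Hex_Digit}/ ? 1 : 0;

# \p{All} is defined as qr/./s, so it should match any code point.
my $all = $above =~ /\p{All}/ ? 1 : 0;

say "ASCII_Hex_Digit=True:  $hex_true";
say "ASCII_Hex_Digit=False: $hex_false";
say "\\P{ASCII_Hex_Digit}:   $not_hex";
say "\\p{All}:               $all";
```

With the warning category left enabled, the `=False` match would be expected to warn, since a successful `\p{}` match on an above-Unicode code point is the case the commit message calls arguable.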
Diffstat (limited to 'pod/perlrecharclass.pod')
-rw-r--r-- pod/perlrecharclass.pod 18
1 files changed, 11 insertions, 7 deletions
diff --git a/pod/perlrecharclass.pod b/pod/perlrecharclass.pod
index a8ee854d15..ee033634e8 100644
--- a/pod/perlrecharclass.pod
+++ b/pod/perlrecharclass.pod
@@ -389,15 +389,19 @@ It is also possible to define your own properties. This is discussed in
L<perlunicode/User-Defined Character Properties>.
Unicode properties are defined (surprise!) only on Unicode code points.
-A warning is raised and all matches fail on non-Unicode code points
-(those above the legal Unicode maximum of 0x10FFFF). This can be
-somewhat surprising,
+Starting in v5.20, when matching against C<\p> and C<\P>, Perl treats
+non-Unicode code points (those above the legal Unicode maximum of
+0x10FFFF) as if they were typical unassigned Unicode code points.
- chr(0x110000) =~ \p{ASCII_Hex_Digit=True} # Fails.
- chr(0x110000) =~ \p{ASCII_Hex_Digit=False} # Also fails!
+Prior to v5.20, Perl raised a warning and made all matches fail on
+non-Unicode code points. This could be somewhat surprising:
-Even though these two matches might be thought of as complements, they
-are so only on Unicode code points.
+ chr(0x110000) =~ \p{ASCII_Hex_Digit=True} # Fails on Perls < v5.20.
+ chr(0x110000) =~ \p{ASCII_Hex_Digit=False} # Also fails on Perls
+ # < v5.20
+
+Even though these two matches might be thought of as complements, until
+v5.20 they were so only on Unicode code points.
=head4 Examples