author | Karl Williamson <public@khwilliamson.com> | 2013-12-23 20:35:54 -0700 |
---|---|---|
committer | Karl Williamson <public@khwilliamson.com> | 2013-12-31 08:27:23 -0700 |
commit | 2d88a86a5910c97496b47b7b7c223f2c9a14b57c (patch) | |
tree | c0125ea6a9b6175c93245c4048773ae82e0f4efc /pod/perlrecharclass.pod | |
parent | f215ab38f4d9ea2dca08fc71b38db0eb650d5107 (diff) | |
download | perl-2d88a86a5910c97496b47b7b7c223f2c9a14b57c.tar.gz |
Change \p{} matching for above-Unicode code points

http://markmail.org/message/eod7ukhbbh5tnll4 is the beginning of the
thread that led to this commit.

This commit revises the handling of \p{} and \P{} to treat above-Unicode
code points as typical Unicode unassigned ones, and to output a warning
during matching only when the answer is arguable under strict Unicode
rules (that is, "matched" for \p{}, and "didn't match" for \P{}). The
exception is when the warning category has been made fatal, in which
case it tries hard to always output the warning. The definition of
\p{All} is changed to be qr/./s, and no warning is issued at all for
matching it against above-Unicode code points.
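
The behavior described in the commit message can be observed with a short script. This is a minimal sketch, not part of the commit: the variable names are illustrative, and it assumes a Perl of v5.20 or later. The `no warnings 'non_unicode'` line is intended to silence the "may not be portable" warning so that only the match results are shown.

```perl
use strict;
use warnings;
no warnings 'non_unicode';   # intended to suppress the above-Unicode match warning

# First code point above the legal Unicode maximum of 0x10FFFF.
my $above = chr(0x110000);

# Record each match result as 1 (matched) or 0 (did not match).
my $is_hex     = $above =~ /\p{ASCII_Hex_Digit=True}/  ? 1 : 0;
my $is_not_hex = $above =~ /\p{ASCII_Hex_Digit=False}/ ? 1 : 0;
my $is_all     = $above =~ /\p{All}/                   ? 1 : 0;

printf "perl %vd\n", $^V;
printf "ASCII_Hex_Digit=True:  %d\n", $is_hex;
printf "ASCII_Hex_Digit=False: %d\n", $is_not_hex;
printf "All:                   %d\n", $is_all;

# With this commit, the two ASCII_Hex_Digit tests should behave as
# complements for above-Unicode code points too.
printf "complementary:         %s\n", ($is_hex != $is_not_hex ? "yes" : "no");
```

With this change applied, the `=True` test should fail while the `=False` and `\p{All}` tests should match, since the code point is treated like an unassigned one. On older Perls the documentation being replaced says both `ASCII_Hex_Digit` tests fail.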
Diffstat (limited to 'pod/perlrecharclass.pod')
-rw-r--r-- | pod/perlrecharclass.pod | 18 |
1 files changed, 11 insertions, 7 deletions
diff --git a/pod/perlrecharclass.pod b/pod/perlrecharclass.pod
index a8ee854d15..ee033634e8 100644
--- a/pod/perlrecharclass.pod
+++ b/pod/perlrecharclass.pod
@@ -389,15 +389,19 @@ It is also possible to define your own properties. This is discussed in
 L<perlunicode/User-Defined Character Properties>.
 
 Unicode properties are defined (surprise!) only on Unicode code points.
-A warning is raised and all matches fail on non-Unicode code points
-(those above the legal Unicode maximum of 0x10FFFF). This can be
-somewhat surprising,
+Starting in v5.20, when matching against C<\p> and C<\P>, Perl treats
+non-Unicode code points (those above the legal Unicode maximum of
+0x10FFFF) as if they were typical unassigned Unicode code points.
 
- chr(0x110000) =~ \p{ASCII_Hex_Digit=True} # Fails.
- chr(0x110000) =~ \p{ASCII_Hex_Digit=False} # Also fails!
+Prior to v5.20, Perl raised a warning and made all matches fail on
+non-Unicode code points. This could be somewhat surprising:
 
-Even though these two matches might be thought of as complements, they
-are so only on Unicode code points.
+ chr(0x110000) =~ \p{ASCII_Hex_Digit=True} # Fails on Perls < v5.20.
+ chr(0x110000) =~ \p{ASCII_Hex_Digit=False} # Also fails on Perls
+                                            # < v5.20
+
+Even though these two matches might be thought of as complements, until
+v5.20 they were so only on Unicode code points.
 
 =head4 Examples
 