author: Karl Williamson <khw@cpan.org>, 2018-08-16 16:27:52 -0600
committer: Karl Williamson <khw@cpan.org>, 2018-08-16 16:54:39 -0600
commit: 8350b2740abc0cad147113487148473a9e19034b (patch)
tree: eec5b35258b4a08d72d703fb76bbf228bd2d4b70 /pod/perlrecharclass.pod
parent: 7da8e27b9d7d2be4e770d074405ddb9941e6c8b7 (diff)
perlre, perlrecharclass: Add examples
This adds more concrete cases of how mixed script digits can be hazardous.
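As an illustration of the hazard described (a minimal sketch, not part of the commit), the snippet below builds a string that mixes an ASCII digit with BENGALI DIGIT FOUR and checks it three ways: with C<\d+>, with C<\d+> under C</a>, and with Unicode::UCD's num(). The helper name safe_decimal_value is hypothetical, introduced here only for the example; it assumes a Perl new enough to support the C</a> modifier.

```perl
use strict;
use warnings;
use Unicode::UCD qw(num);

# Hypothetical helper (not from the commit): return the numeric value of
# $str, or undef unless it consists solely of decimal digits that all
# come from a single script.
sub safe_decimal_value {
    my ($str) = @_;
    return undef unless $str =~ /\A\d+\z/;   # must be decimal digits only
    return num($str);                        # undef for mixed-script digits
}

my $ascii   = "18";                  # ASCII DIGIT ONE, ASCII DIGIT EIGHT
my $mixed   = "1\x{09EA}";           # ASCII DIGIT ONE, BENGALI DIGIT FOUR
my $bengali = "\x{09E7}\x{09EA}";    # BENGALI DIGIT ONE, BENGALI DIGIT FOUR

for my $case ([ascii => $ascii], [mixed => $mixed], [bengali => $bengali]) {
    my ($label, $str) = @$case;
    my $value = safe_decimal_value($str);
    printf "%-8s \\d+: %-3s \\d+ under /a: %-3s value: %s\n",
        $label,
        ($str =~ /\A\d+\z/  ? "yes" : "no"),
        ($str =~ /\A\d+\z/a ? "yes" : "no"),
        (defined $value ? $value : "undef");
}
```

The mixed string matches plain C<\d+> yet has no safe value, so the helper returns C<undef> for it; the all-ASCII and all-Bengali strings yield 18 and 14 respectively. Under C</a> only the ASCII string matches.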
Diffstat (limited to 'pod/perlrecharclass.pod')
-rw-r--r-- pod/perlrecharclass.pod | 7
1 files changed, 5 insertions, 2 deletions
diff --git a/pod/perlrecharclass.pod b/pod/perlrecharclass.pod
index 3b5c5b12b1..225a092c05 100644
--- a/pod/perlrecharclass.pod
+++ b/pod/perlrecharclass.pod
@@ -109,14 +109,17 @@ security issues.
Some digits that C<\d> matches look like some of the [0-9] ones, but
have different values. For example, BENGALI DIGIT FOUR (U+09EA) looks
-very much like an ASCII DIGIT EIGHT (U+0038). An application that
+very much like an ASCII DIGIT EIGHT (U+0038), and LEPCHA DIGIT SIX
+(U+1C46) looks very much like an ASCII DIGIT FIVE (U+0035). An
+application that
is expecting only the ASCII digits might be misled, or if the match is
C<\d+>, the matched string might contain a mixture of digits from
different writing systems that look like they signify a number different
than they actually do. L<Unicode::UCD/num()> can
be used to safely
calculate the value, returning C<undef> if the input string contains
-such a mixture.
+such a mixture. Otherwise, for example, a displayed price might be
+deliberately different than it appears.
What C<\p{Digit}> means (and hence C<\d> except under the C</a>
modifier) is C<\p{General_Category=Decimal_Number}>, or synonymously,