Unicode::UCD: Use NFD(), not NFKD(), for Hangul syllables

The Hangul syllable decompositions are all canonical, not compatibility-only, so NFD() gives the same result as NFKD(). I think NFD() is clearer, for the reasons explained in the revised comment.
author: Karl Williamson <public@khwilliamson.com> 2011-08-28 12:23:36 -0600
committer: Karl Williamson <public@khwilliamson.com> 2011-09-02 11:57:44 -0600
commit: 94c91ffce4d762ae97191ff1793a44b41a0b93ca (patch)
tree: 2a5692b77f4313ae20728ff4ec0b13b60c5a2429 /lib
parent: eefd7bc2e29af165901c413239450ee3cde8cdfb (diff)
download: perl-94c91ffce4d762ae97191ff1793a44b41a0b93ca.tar.gz
1 files changed, 10 insertions, 8 deletions
diff --git a/lib/Unicode/UCD.pm b/lib/Unicode/UCD.pm
index 82ddb27d61..d359b48752 100644
--- a/lib/Unicode/UCD.pm
+++ b/lib/Unicode/UCD.pm
@@ -4,7 +4,7 @@ use strict;
 use warnings;
 no warnings 'surrogate';    # surrogates can be inputs to this
 use charnames ();
-use Unicode::Normalize qw(getCombinClass NFKD);
+use Unicode::Normalize qw(getCombinClass NFD);
 
 our $VERSION = '0.35';
 
@@ -340,18 +340,20 @@ sub charinfo {
     # For most code points, we can just read in "unicore/Decomposition.pl", as
     # its contents are exactly what should be output.  But that file doesn't
     # contain the data for the Hangul syllable decompositions, which can be
-    # algorithmically computed, and NFKD() does that, so we call NFKD() for
-    # those.  We can't use NFKD() for everything, as it does a complete
+    # algorithmically computed, and NFD() does that, so we call NFD() for
+    # those.  We can't use NFD() for everything, as it does a complete
     # recursive decomposition, and what this function has always done is to
-    # return what's in UnicodeData.txt which doesn't have the recursivenss
-    # specified in the decomposition types.  No decomposition implies an empty
-    # field; otherwise, all but "Canonical" imply a compatible decomposition,
-    # and the type is prefixed to that, as it is in UnicodeData.txt
+    # return what's in UnicodeData.txt which doesn't show that recursiveness.
+    # Fortunately, the NFD() of the Hanguls doesn't have any recursion
+    # issues.
+    # Having no decomposition implies an empty field; otherwise, all but
+    # "Canonical" imply a compatible decomposition, and the type is prefixed
+    # to that, as it is in UnicodeData.txt
     if ($char =~ /\p{Block=Hangul_Syllables}/) {
         # The code points of the decomposition are output in standard Unicode
         # hex format, separated by blanks.
         $prop{'decomposition'} = join " ", map { sprintf("%04X", $_)}
-                                           unpack "U*", NFKD($char);
+                                           unpack "U*", NFD($char);
     }
     else {
         @DECOMPOSITIONS = _read_table("unicore/Decomposition.pl")
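
The comment in the patch says that Hangul syllable decompositions can be computed algorithmically and that NFD() does this. The following is a minimal sketch, not part of the commit, that checks both claims for a few sample syllables. The helpers hex_points() and hangul_decompose() are hypothetical names introduced for illustration. The constants are those of the Hangul syllable decomposition algorithm in the Unicode Standard.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFD NFKD);

# Constants from the Unicode Standard's Hangul syllable decomposition
# algorithm (not taken from UCD.pm).
my ($SBase, $LBase, $VBase, $TBase) = (0xAC00, 0x1100, 0x1161, 0x11A7);
my ($VCount, $TCount, $SCount)      = (21, 28, 11172);

# Hypothetical helper: format code points the way charinfo() does,
# as blank-separated hex with at least four digits.
sub hex_points {
    return join " ", map { sprintf("%04X", $_) } @_;
}

# Hypothetical helper: arithmetic decomposition of one Hangul syllable
# code point into its leading consonant, vowel, and optional trailing
# consonant.  Code points outside the syllable range are returned as-is.
sub hangul_decompose {
    my ($cp) = @_;
    my $s_index = $cp - $SBase;
    return ($cp) if $s_index < 0 || $s_index >= $SCount;

    my $l       = $LBase + int($s_index / ($VCount * $TCount));
    my $v       = $VBase + int(($s_index % ($VCount * $TCount)) / $TCount);
    my $t_index = $s_index % $TCount;

    return $t_index ? ($l, $v, $TBase + $t_index) : ($l, $v);
}

# First syllable, a syllable with a trailing consonant, and the last syllable.
for my $cp (0xAC00, 0xAC01, 0xD7A3) {
    my $char = chr $cp;
    my $nfd  = hex_points(unpack "U*", NFD($char));
    my $nfkd = hex_points(unpack "U*", NFKD($char));
    my $calc = hex_points(hangul_decompose($cp));

    printf "U+%04X  NFD: %s  NFKD: %s  arithmetic: %s\n",
        $cp, $nfd, $nfkd, $calc;
}
```

For these samples, all three columns should agree on each line, which is why swapping NFKD() for NFD() does not change what charinfo() reports for this block.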