regen/regcharclass_multi_char_folds.pl: Add some comments

author: Karl Williamson <khw@cpan.org> 2014-04-28 19:44:28 -0600
committer: Karl Williamson <khw@cpan.org> 2014-05-30 10:24:27 -0600
commit: b07262fd84b1e9ea4e247dfe6afa0f01f5bf0980 (patch)
tree: 5e87af5bc0912bb7fb6928837f4d5e3bdebf6d33 /regen
parent: f71bd7891602029c8bb23300877f55d2a3971262 (diff)
download: perl-b07262fd84b1e9ea4e247dfe6afa0f01f5bf0980.tar.gz
1 files changed, 13 insertions, 6 deletions
diff --git a/regen/regcharclass_multi_char_folds.pl b/regen/regcharclass_multi_char_folds.pl
index 7a4c2a6d96..caee865069 100644
--- a/regen/regcharclass_multi_char_folds.pl
+++ b/regen/regcharclass_multi_char_folds.pl
@@ -15,12 +15,19 @@ use Unicode::UCD "prop_invmap";
 # this code is designed to help regcomp.c, and EXACTFish regnodes.  For
 # non-UTF-8 patterns, the strings are not folded, so we need to check for the
 # upper and lower case versions.  For UTF-8 patterns, the strings are folded,
-# so we only need to worry about the fold version.  There are no non-ASCII
-# Latin1 multi-char folds currently, and none likely to be ever added.  Thus
-# the output is the same as if it were just asking for ASCII characters, not
-# full Latin1.  Hence, it is suitable for generating things that match
-# EXACTFA.  It does check for and croak if there ever were to be an upper
-# Latin1 range multi-character fold.
+# except in EXACTFL nodes) so we only need to worry about the fold version.
+# All folded-to characters in non-UTF-8 (Latin1) are members of fold-pairs,
+# at least within Latin1, 'k', and 'K', for example.  So there aren't
+# complications with dealing with unfolded input.  That's not true of UTF-8
+# patterns, where things can get tricky.  Thus for EXACTFL nodes where things
+# aren't all folded, code has to be written specially to handle this, instead
+# of the macros here being extended to try to handle it.
+#
+# There are no non-ASCII Latin1 multi-char folds currently, and none likely to
+# be ever added.  Thus the output is the same as if it were just asking for
+# ASCII characters, not full Latin1.  Hence, it is suitable for generating
+# things that match EXACTFA.  It does check for and croak if there ever were
+# to be an upper Latin1 range multi-character fold.
 #
 # This is designed for input to regen/regcharlass.pl.