author: Karl Williamson <public@khwilliamson.com>, 2012-06-28 13:32:17 -0600
committer: Karl Williamson <public@khwilliamson.com>, 2012-07-24 21:13:43 -0600
commit: e94e94b5ceeb265476690d9992a953b7d876f3a1
tree: 06c23cac92667c6d749c6cf3ccd882aa5f077075
parent: f792674226f74e98903d6b00d08167effecfd8e9
mktables: Generate new table for foldable chars
This table consists of all characters that participate in any way in a fold in the current Unicode version. regcomp.c currently uses the Cased property as a proxy for these. The information is used to limit the number of characters whose folds have to be dealt with when compiling bracketed regex character classes.

It turns out that Cased contains over 1300 code points that do not actually appear in any fold, which means potential extra work when compiling. This patch allows that work to be avoided.

A few characters in the new table are not in Cased, which makes them potential bugs under the old way of doing things. In Unicode 6.1, these are: U+02BC MODIFIER LETTER APOSTROPHE, U+0308 COMBINING DIAERESIS, U+0313 COMBINING COMMA ABOVE, and U+0342 COMBINING GREEK PERISPOMENI. I can't figure out how these might currently be causing a bug, but this patch fixes any such bug.
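
To illustrate what "participates in any way in a fold" means, here is a minimal Python sketch. It is not the mktables code from this commit. It assumes that Python's `str.casefold()` (full case folding, using whatever Unicode version the interpreter ships rather than 6.1) is a reasonable stand-in, and the function name `foldable_code_points` is made up for this example. A code point counts as a participant if it changes under folding or if it appears anywhere in the folded form of another code point.

```python
import sys


def foldable_code_points():
    """Collect code points that take part in a full case fold.

    Hypothetical helper for illustration only: a code point is included
    if folding changes it, or if it occurs in the result of folding
    some other code point (including multi-character folds).
    """
    participants = set()
    for cp in range(sys.maxunicode + 1):
        ch = chr(cp)
        folded = ch.casefold()
        if folded != ch:
            # The source of the fold participates...
            participants.add(cp)
            # ...and so does every character in its folded form.
            participants.update(ord(c) for c in folded)
    return participants


if __name__ == "__main__":
    table = foldable_code_points()
    # The four code points named in the commit message appear only as
    # pieces of multi-character folds, e.g. U+0149 folds to U+02BC + "n".
    for cp in (0x02BC, 0x0308, 0x0313, 0x0342):
        print(f"U+{cp:04X} in table: {cp in table}")
```

Under these assumptions, the four characters listed above end up in the set because they are components of multi-character folds, even though none of them changes when folded on its own.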
Diffstat (limited to 'Configure')
0 files changed, 0 insertions, 0 deletions