author Karl Williamson <public@khwilliamson.com> 2011-03-19 19:19:50 -0600
committer Karl Williamson <public@khwilliamson.com> 2011-03-19 21:48:33 -0600
commit 8d5d17fad2e8a7a4ca7bd0e424933fd94274f607
tree 9343af29cae53d2b11d9bead1a44a5dd9ae12e22 /regexec.c
parent a4e790c1e104e81d3916c3ff82ac9854ff247966
regexec.c: Update comment
Diffstat (limited to 'regexec.c')
-rw-r--r-- regexec.c 38
1 files changed, 13 insertions, 25 deletions
diff --git a/regexec.c b/regexec.c
index 0be7eda5a2..93e1417a3b 100644
--- a/regexec.c
+++ b/regexec.c
@@ -6628,31 +6628,19 @@ S_reginclass(pTHX_ const regexp * const prog, register const regnode * const n,
else if (flags & ANYOF_LOC_NONBITMAP_FOLD) {
/* Here, we need to test if the fold of the target string
- * matches. In the case of a multi-char fold that is
- * caught by regcomp.c, it has stored all such folds into
- * 'av'; we linearly check to see if any match the target
- * string (folded). We know that the originals were each
- * one character, but we don't currently know how many
- * characters/bytes each folded to, except we do know that
- * there are small limits imposed by Unicode. XXX A
- * performance enhancement would be to have regcomp.c store
- * the max number of chars/bytes that are in an av entry,
- * as, say the 0th element. Even better would be to have a
- * hash of the few characters that can start a multi-char
- * fold to the max number of chars of those folds.
- *
- * Further down, if there isn't a
- * match in the av, we will check if there is another
- * fold-type match. For that, we also need the fold, but
- * only the first character. No sense in folding it twice,
- * so we do it here, even if there isn't any multi-char
- * fold, so we always fold at least the first character.
- * If the node is a straight ANYOF node, or there is only
- * one character available in the string, or if there isn't
- * any av, that's all we have to fold. In the case of a
- * multi-char fold, we do have guarantees in Unicode that
- * it can only expand up to so many characters and so many
- * bytes. We keep track so don't exceed either.
+ * matches. The non-multi char folds have all been moved to
+ * the compilation phase, and the multi-char folds have
+ * been stored by regcomp into 'av'; we linearly check to
+ * see if any match the target string (folded). We know
+ * that the originals were each one character, but we don't
+ * currently know how many characters/bytes each folded to,
+ * except we do know that there are small limits imposed by
+ * Unicode. XXX A performance enhancement would be to have
+ * regcomp.c store the max number of chars/bytes that are
+ * in an av entry, as, say the 0th element. Even better
+ * would be to have a hash of the few characters that can
+ * start a multi-char fold to the max number of chars of
+ * those folds.
*
* If there is a match, we will need to advance (if lenp is
* specified) the match pointer in the target string. But