author | Karl Williamson <khw@cpan.org> | 2014-08-23 20:50:44 -0600 |
---|---|---|
committer | Karl Williamson <khw@cpan.org> | 2014-08-25 11:13:41 -0600 |
commit | e053ac115b3e736f1e52408b1ba193b2cf1f74ee (patch) | |
tree | 7a8edebcb6e33fbe344387cfc926fbce27a1c794 /embed.fnc | |
parent | 6424181e76be853870b9e777f403d093b1f8dfdd (diff) | |
Improve -Dr output of bracketed char classes
I look at this output a lot to verify that patterns compiled correctly.
This commit makes them somewhat easier to read, while extending this to
also work on EBCDIC platforms (as yet untested).
In staring at these over time, I realized that punctuation literals are
mostly what makes them hard to read. [A-Z] is just as
readable as [A-Y], but [%!@\]~] is harder to read than if there were
fewer. Sometimes that can't be helped, but if many get output,
inverting the pattern [^...] can cause fewer to be output. This commit
employs heuristics to invert when it thinks that that would be more
legible. For example, it converts the output of [^"'] to be
ANYOF[^"'][{unicode_all}]
instead of
ANYOF[\x{00}-\x{1F} !#$%&()*+,\-./0-9:;<=>?@A-Z[\\\]\^_`a-z{|}~\x{7F}-\x{FF}][{unicode_all}]
Since it is a heuristic, it may not be the best under all circumstances,
and may need to be tweaked in the future.
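As an illustration only, the inversion decision could be sketched as below. This is not the code in regcomp.c: the names count_punct and should_invert, the 256-entry bool table, and the rule "invert when the complement contains fewer ASCII punctuation characters" are all assumptions made for this sketch, and the real heuristic may weigh things differently.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not from regcomp.c): count the ASCII punctuation
 * characters whose membership in the class equals 'want'. */
static size_t count_punct(const bool set[256], bool want)
{
    size_t n = 0;
    for (int c = 0; c < 128; c++) {
        if (ispunct(c) && set[c] == want)
            n++;
    }
    return n;
}

/* Assumed rule: print the class inverted when its complement would show
 * fewer punctuation literals than the class itself. */
static bool should_invert(const bool set[256])
{
    return count_punct(set, false) < count_punct(set, true);
}
```

Under this assumed rule, a class matching everything except " and ' would be printed inverted, since only two punctuation literals then appear, while [A-Z] would be printed as is.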
If almost all the printables are to be output, it uses a hex range, as
that is probably more closely aligned with the intent of the pattern
than which individual printables are desired. Again this heuristic can
be tweaked.
It also now prints a leading 0 on any value that was formerly output as
a single hex digit: \x{0A} instead of \x{A}.
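A minimal sketch of the zero-padded hex form, assuming plain snprintf rather than the SV-appending calls the real code uses. The function name format_hex_escape is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: write a code point as \x{..} with at least two
 * uppercase hex digits, so 0xA becomes \x{0A} rather than \x{A}.
 * Returns what snprintf returns. */
static int format_hex_escape(char *buf, size_t len, unsigned long cp)
{
    return snprintf(buf, len, "\\x{%02lX}", cp);
}
```

The %02lX conversion only sets a minimum width, so values of three or more hex digits are printed unchanged.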
Diffstat (limited to 'embed.fnc')
-rw-r--r-- | embed.fnc | 3 |
1 files changed, 2 insertions, 1 deletions
@@ -2194,7 +2194,8 @@ Es |void |put_byte |NN SV* sv|int c
 Es |bool |put_charclass_bitmap_innards|NN SV* sv \
    |NN char* bitmap \
    |NULLOK SV** bitmap_invlist
-Es |void |put_range |NN SV* sv|UV start|const UV end
+Es |void |put_range |NN SV* sv|UV start|const UV end \
+   |const bool allow_literals
 Es |void |dump_trie |NN const struct _reg_trie_data *trie\
    |NULLOK HV* widecharmap|NN AV *revcharmap\
    |U32 depth