diff options
author | Father Chrysostomos <sprout@cpan.org> | 2011-10-25 15:40:40 -0700 |
---|---|---|
committer | Father Chrysostomos <sprout@cpan.org> | 2011-10-26 18:22:18 -0700 |
commit | 1bb8785ab1af03172a3a220f8948d33bdc3dd374 (patch) | |
tree | a14c14464b4aa65c51fe30d11a6b79bac2bb3313 /metaconfig.SH | |
parent | edfed4c3099abba2a8b83e8dde6bcff0952c07f5 (diff) | |
download | perl-1bb8785ab1af03172a3a220f8948d33bdc3dd374.tar.gz |
Rewrite csh_glob in C; fix two quoting bugs
This commit rewrites File::Glob::csh_glob (which implements perl’s
default globbing behaviour) in C.
This fixes a problem introduced by 0b0e6d70f. If there is an
unmatched quotation mark, all attempts to parse the pattern are
discarded and it is treated as a single token. Prior to 0b0e6d70f,
whitespace was stripped from both ends in that case. As of 0b0e6d70f,
it was only stripped from the beginning. This commit restores the
pre-0b0e6d70f behaviour with unmatched quotes. It doesn’t take
'a"b\ ' into account (where the space is escaped), but that wasn’t
handled properly before 0b0e6d70f, either.
This also finishes making csh_glob consistent with regard to quota-
tion marks. Commit 0b0e6d70f attempted to do that, but did not strip
out medial quotation marks, as in a"b"c. Text::ParseWords does not
provide an interface for stripping out quotation marks but leaving
backslashes, which I tried to work around, not fully understanding
the implications. Anyway, this new C implementation doesn’t use
Text::ParseWords.
The latter fix caused a test failure, but that test was there to make
sure the behaviour didn’t change depending on whether File::Glob
was loaded before the first mention of glob(). (In 5.6, loading
File::Glob first would make perl revert to external csh glob, ironic-
ally enough.) This commit modifies the test to test for sameness,
rather than exact output. In fact, this change causes perl and
miniperl to be consistent, and probably also causes glob to be more
consistent across platforms (think of VMS).
Another effect of the translation to C is that the Unicode Bug is
fixed with regard to splitting patterns. The C code effectively does
/\s/a now (which I believe is the only sane behaviour in this case),
instead of treating the string differently depending on the UTF8 flag.
The Unicode Bug is still present with regard to actual globbing.
This commit introduces one regression. This code:
undef %File::Glob::;
glob("nometachars");
will no longer return anything, because csh_glob no longer holds a
reference count on the $File::Glob::DEFAULT_FLAGS glob. Any code that
does that is beyond crazy.
The big advantage to this patch is speed. Something like
‘@files = <*>’ is 18% faster in a folder of 300 files. For smaller
folders there should be an even more notable difference.
Diffstat (limited to 'metaconfig.SH')
0 files changed, 0 insertions, 0 deletions