author | Derrick Stolee <dstolee@microsoft.com> | 2019-12-13 18:09:53 +0000 |
---|---|---|
committer | Junio C Hamano <gitster@pobox.com> | 2019-12-13 12:01:02 -0800 |
commit | 190a65f9db8db9d87d54351429f7879fcb4ad608 (patch) | |
tree | 69147479da50321ae236a5910783c6d7a62e365f /dir.c | |
parent | cff4e9138d8df45e3b6199171092ee781cdadaeb (diff) | |
download | git-190a65f9db8db9d87d54351429f7879fcb4ad608.tar.gz |
sparse-checkout: respect core.ignoreCase in cone mode

When a user uses the sparse-checkout feature in cone mode, they
add patterns using "git sparse-checkout set <dir1> <dir2> ..."
or by using "--stdin" to provide the directories line-by-line over
stdin. This behaviour naturally looks a lot like the way a user
would type "git add <dir1> <dir2> ..."

If core.ignoreCase is enabled, then "git add" will match the input
using a case-insensitive match. Do the same for the sparse-checkout
feature.

Perform case-insensitive checks while updating the skip-worktree
bits during unpack_trees(). This is done by changing the hash
algorithm and hashmap comparison methods to optionally use case-
insensitive methods.
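
To illustrate why the hash and the comparison must switch together, here is
a minimal sketch, not the code in dir.c. It assumes an FNV-1 style string
hash with ASCII-only case folding; all demo_* names are hypothetical and
exist only for this example. If only the comparison ignored case, two
spellings of the same directory would usually land in different buckets and
never be compared at all.

```c
#include <assert.h>
#include <string.h>
#include <strings.h>	/* strcasecmp (POSIX) */

#define DEMO_FNV32_BASE  0x811c9dc5u
#define DEMO_FNV32_PRIME 0x01000193u

/* Hypothetical helper: case-sensitive FNV-1 hash of a NUL-terminated string. */
static unsigned int demo_strhash(const char *s)
{
	unsigned int hash = DEMO_FNV32_BASE;

	assert(s != NULL);
	while (*s)
		hash = (hash * DEMO_FNV32_PRIME) ^ (unsigned char)*s++;
	return hash;
}

/* Hypothetical helper: same hash, folding ASCII a-z to A-Z before mixing. */
static unsigned int demo_strihash(const char *s)
{
	unsigned int hash = DEMO_FNV32_BASE;

	assert(s != NULL);
	while (*s) {
		unsigned int c = (unsigned char)*s++;
		if (c >= 'a' && c <= 'z')
			c -= 'a' - 'A';
		hash = (hash * DEMO_FNV32_PRIME) ^ c;
	}
	return hash;
}

/* Pick the hash according to the flag, as the patch does with ignore_case. */
static unsigned int demo_pattern_hash(const char *pattern, int ignore_case)
{
	return ignore_case ? demo_strihash(pattern) : demo_strhash(pattern);
}

/* The equality check has to follow the same flag as the hash. */
static int demo_pattern_eq(const char *a, const char *b, int ignore_case)
{
	return ignore_case ? !strcasecmp(a, b) : !strcmp(a, b);
}
```

With the flag set, "/Docs/API" and "/docs/api" hash to the same value and
compare equal; with it cleared, they are distinct entries.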
When this is enabled, there is a small performance cost in the
hashing algorithm. To tease out the worst possible case, the
following was run on a repo with a deep directory structure:

	git ls-tree -d -r --name-only HEAD |
		git sparse-checkout set --stdin

The 'set' command was timed with core.ignoreCase disabled or
enabled. For the repo with a deep directory structure, the numbers were

	core.ignoreCase=false: 62s
	core.ignoreCase=true:  74s (+19.3%)

For reproducibility, the equivalent test on the Linux kernel
repository had these numbers:

	core.ignoreCase=false: 3.1s
	core.ignoreCase=true:  3.6s (+16%)

Now, this is not an entirely fair comparison, as most users
will define their sparse cone using more shallow directories,
and the performance improvement from eb42feca97 ("unpack-trees:
hash less in cone mode" 2019-11-21) can remove most of the
hash cost. For a more realistic test, drop the "-r" from the
ls-tree command to store only the first-level directories.
In that case, the Linux kernel repository takes 0.2-0.25s in
each case, and the deep repository takes one second, plus or
minus 0.05s, in each case.

Thus, we _can_ demonstrate a cost to this change, but it is
unlikely to matter to any reasonable sparse-checkout cone.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Diffstat (limited to 'dir.c')
-rw-r--r-- | dir.c | 15 |
1 files changed, 12 insertions, 3 deletions
@@ -625,6 +625,8 @@ int pl_hashmap_cmp(const void *unused_cmp_data,
 			 ? ee1->patternlen
 			 : ee2->patternlen;
 
+	if (ignore_case)
+		return strncasecmp(ee1->pattern, ee2->pattern, min_len);
 	return strncmp(ee1->pattern, ee2->pattern, min_len);
 }
 
@@ -665,7 +667,9 @@ static void add_pattern_to_hashsets(struct pattern_list *pl, struct path_pattern
 	translated->pattern = truncated;
 	translated->patternlen = given->patternlen - 2;
 	hashmap_entry_init(&translated->ent,
-			   memhash(translated->pattern, translated->patternlen));
+			   ignore_case ?
+			   strihash(translated->pattern) :
+			   strhash(translated->pattern));
 
 	if (!hashmap_get_entry(&pl->recursive_hashmap,
 			       translated, ent, NULL)) {
@@ -694,7 +698,9 @@ static void add_pattern_to_hashsets(struct pattern_list *pl, struct path_pattern
 	translated->pattern = xstrdup(given->pattern);
 	translated->patternlen = given->patternlen;
 	hashmap_entry_init(&translated->ent,
-			   memhash(translated->pattern, translated->patternlen));
+			   ignore_case ?
+			   strihash(translated->pattern) :
+			   strhash(translated->pattern));
 
 	hashmap_add(&pl->recursive_hashmap, &translated->ent);
 
@@ -724,7 +730,10 @@ static int hashmap_contains_path(struct hashmap *map,
 	/* Check straight mapping */
 	p.pattern = pattern->buf;
 	p.patternlen = pattern->len;
-	hashmap_entry_init(&p.ent, memhash(p.pattern, p.patternlen));
+	hashmap_entry_init(&p.ent,
+			   ignore_case ?
+			   strihash(p.pattern) :
+			   strhash(p.pattern));
 	return !!hashmap_get_entry(map, &p, ent, NULL);
 }
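
The first hunk compares two patterns only over the length of the shorter
one, so the case-insensitive branch has to be bounded the same way. The
following is a rough standalone sketch of that comparison under stated
assumptions: struct demo_pattern and demo_pattern_cmp() are hypothetical
stand-ins for the real entry type and callback, and the flag is passed as a
parameter here instead of being read from a global.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>	/* strncasecmp (POSIX) */

/* Hypothetical stand-in for a pattern entry: a string and its length. */
struct demo_pattern {
	const char *pattern;
	size_t patternlen;
};

/*
 * Returns 0 when the two patterns agree over the shorter length,
 * optionally ignoring case. A non-zero result means "different entry".
 */
static int demo_pattern_cmp(const struct demo_pattern *a,
			    const struct demo_pattern *b,
			    int ignore_case)
{
	size_t min_len = a->patternlen <= b->patternlen
			 ? a->patternlen
			 : b->patternlen;

	assert(a->pattern != NULL && b->pattern != NULL);
	if (ignore_case)
		return strncasecmp(a->pattern, b->pattern, min_len);
	return strncmp(a->pattern, b->pattern, min_len);
}
```

For example, comparing "/Docs" (length 5) with "/docs/api" (length 9)
yields 0 when ignore_case is set and a non-zero value otherwise.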