From 5c9f9bf313e872934c9b4563b7b0bcdedb39eb86 Mon Sep 17 00:00:00 2001 From: Jeff King Date: Fri, 3 Jun 2016 03:07:34 -0400 Subject: rev-list: "adjust" results of "--count --use-bitmap-index -n" If you ask rev-list for: git rev-list --count --use-bitmap-index HEAD we optimize out the actual traversal and just give you the number of bits set in the commit bitmap. This is faster, which is good. But if you ask to limit the size of the traversal, like: git rev-list --count --use-bitmap-index -n 100 HEAD we'll still output the full bitmapped number we found. On the surface, that might even seem OK. You explicitly asked to use the bitmap index, and it was cheap to compute the real answer, so we gave it to you. But there's something much more complicated going on under the hood. If we don't have a bitmap directly for HEAD, then we have to actually traverse backwards, looking for a bitmapped commit. And _that_ traversal is bounded by our `-n` count. This is a good thing, because it bounds the work we have to do, which is probably what the user wanted by asking for `-n`. But now it makes the output quite confusing. You might get many values: - your `-n` value, if we walked back and never found a bitmap (or fewer if there weren't that many commits) - the actual full count, if we found a bitmap root for every path of our traversal with in the `-n` limit - any number in between! We might have walked back and found _some_ bitmaps, but then cut off the traversal early with some commits not accounted for in the result. So you cannot even see a value higher than your `-n` and say "OK, bitmaps kicked in, this must be the real full count". The only sane thing is for git to just clamp the value to a maximum of the `-n` value, which means we should output the exact same results whether bitmaps are in use or not. The test in t5310 demonstrates this by using `-n 1`. Without this patch we fail in the full-bitmap case (where we do not have to traverse at all) but _not_ in the partial-bitmap case (where we have to walk down to find an actual bitmap). With this patch, both cases just work. I didn't implement the crazy in-between case, just because it's complicated to set up, and is really a subset of the full-count case, which we do cover. Signed-off-by: Jeff King Signed-off-by: Junio C Hamano --- builtin/rev-list.c | 3 +++ 1 file changed, 3 insertions(+) (limited to 'builtin') diff --git a/builtin/rev-list.c b/builtin/rev-list.c index 7ae255862a..a963564617 100644 --- a/builtin/rev-list.c +++ b/builtin/rev-list.c @@ -355,8 +355,11 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix) if (use_bitmap_index && !revs.prune) { if (revs.count && !revs.left_right && !revs.cherry_mark) { uint32_t commit_count; + int max_count = revs.max_count; if (!prepare_bitmap_walk(&revs)) { count_bitmap_commit_list(&commit_count, NULL, NULL, NULL); + if (max_count >= 0 && max_count < commit_count) + commit_count = max_count; printf("%d\n", commit_count); return 0; } -- cgit v1.2.1 From fb85db84dcc36dcc3cdd5fc744280334bbe8eb47 Mon Sep 17 00:00:00 2001 From: Jeff King Date: Fri, 3 Jun 2016 03:08:05 -0400 Subject: rev-list: disable bitmaps when "-n" is used with listing objects You can ask rev-list to use bitmaps to speed up an --objects traversal, which should generally give you your answers much faster. Likewise, you can ask rev-list to limit such a traversal with `-n`, in which case we'll show only a limited set of commits (and only the tree and commit objects directly reachable from those commits). But if you do both together, the results are nonsensical. We end up limiting any fallback traversal we do to _find_ the bitmaps, but the actual set of objects we output will be picked arbitrarily from the union of any bitmaps we do find, and will involve the objects of many more commits. It's possible that somebody might want this as a "show me what you can, but limit the amount of work you do" flag. But as with the prior commit clamping "--count", the results are basically non-deterministic; you'll get the values from some commits between `n` and the total number, and you can't tell which. And unlike the `--count` case, we can't easily generate the "real" value from the bitmap values (you can't just walk back `-n` commits and subtract out the reachable objects from the boundary commits; the bitmaps for `X` record its total reachability, so you don't know which objects are directly from `X` itself, which from `X^`, and so on). So let's just fallback to the non-bitmap code path in this case, so we always give a sane answer. Signed-off-by: Jeff King Signed-off-by: Junio C Hamano --- builtin/rev-list.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) (limited to 'builtin') diff --git a/builtin/rev-list.c b/builtin/rev-list.c index a963564617..7597be5b3a 100644 --- a/builtin/rev-list.c +++ b/builtin/rev-list.c @@ -363,7 +363,8 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix) printf("%d\n", commit_count); return 0; } - } else if (revs.tag_objects && revs.tree_objects && revs.blob_objects) { + } else if (revs.max_count < 0 && + revs.tag_objects && revs.tree_objects && revs.blob_objects) { if (!prepare_bitmap_walk(&revs)) { traverse_bitmap_commit_list(&show_object_fast); return 0; -- cgit v1.2.1