From 30e12b924b57b15e707f1749f2e5af15f1c7fe09 Mon Sep 17 00:00:00 2001
From: "Michael S. Tsirkin" <mst@redhat.com>
Date: Sun, 27 Apr 2014 21:15:44 +0300
Subject: patch-id: make it stable against hunk reordering

Patch id changes if users reorder file diffs that make up a patch.

As the result is functionally equivalent, a different patch id is
surprising to many users.
In particular, reordering files using diff -O is helpful to make patches
more readable (e.g. API header diff before implementation diff).

Add an option to change patch-id behaviour making it stable against
these kinds of patch change:
calculate SHA1 hash for each hunk separately and sum all hashes
(using a symmetrical sum) to get patch id

We use a 20byte sum and not xor - since xor would give 0 output
for patches that have two identical diffs, which isn't all that
unlikely (e.g. append the same line in two places).

The new behaviour is enabled
- when patchid.stable is true
- when --stable flag is present

Using a new flag --unstable or setting patchid.stable to false force
the historical behaviour.

In the documentation, clarify that patch ID can now be a sum of hashes,
not a hash.
Document how command line and config options affect the
behaviour.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/git-patch-id.txt | 37 ++++++++++++++++++++++++++++++++-----
 1 file changed, 32 insertions(+), 5 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/git-patch-id.txt b/Documentation/git-patch-id.txt
index 312c3b1fe5..31efc587ee 100644
--- a/Documentation/git-patch-id.txt
+++ b/Documentation/git-patch-id.txt
@@ -8,14 +8,14 @@ git-patch-id - Compute unique ID for a patch
 SYNOPSIS
 --------
 [verse]
-'git patch-id' < <patch>
+'git patch-id' [--stable | --unstable] < <patch>
 
 DESCRIPTION
 -----------
-A "patch ID" is nothing but a SHA-1 of the diff associated with a patch, with
-whitespace and line numbers ignored. As such, it's "reasonably stable", but at
-the same time also reasonably unique, i.e., two patches that have the same "patch
-ID" are almost guaranteed to be the same thing.
+A "patch ID" is nothing but a sum of SHA-1 of the file diffs associated with a
+patch, with whitespace and line numbers ignored. As such, it's "reasonably
+stable", but at the same time also reasonably unique, i.e., two patches that
+have the same "patch ID" are almost guaranteed to be the same thing.
 
 IOW, you can use this thing to look for likely duplicate commits.
 
@@ -27,6 +27,33 @@ This can be used to make a mapping from patch ID to commit ID.
 
 OPTIONS
 -------
+
+--stable::
+	Use a "stable" sum of hashes as the patch ID. With this option:
+	 - Reordering file diffs that make up a patch does not affect the ID.
+	   In particular, two patches produced by comparing the same two trees
+	   with two different settings for "-O<orderfile>" result in the same
+	   patch ID signature, thereby allowing the computed result to be used
+	   as a key to index some meta-information about the change between
+	   the two trees;
+
+	 - Result is different from the value produced by git 1.9 and older
+	   or produced when an "unstable" hash (see --unstable below) is
+	   configured - even when used on a diff output taken without any use
+	   of "-O<orderfile>", thereby making existing databases storing such
+	   "unstable" or historical patch-ids unusable.
+
+	This is the default if patchid.stable is set to true.
+
+--unstable::
+	Use an "unstable" hash as the patch ID. With this option,
+	the result produced is compatible with the patch-id value produced
+	by git 1.9 and older. Users with pre-existing databases storing
+	patch-ids produced by git 1.9 and older (who do not deal with reordered
+	patches) may want to use this option.
+
+	This is the default.
+
 <patch>::
 	The diff to create the ID of.
 
-- 
cgit v1.2.1
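
The commit message's argument for a 20-byte sum rather than xor can be
illustrated with a small Python sketch. This is not git's implementation:
it assumes each file diff is already available as a normalized string, and
the names stable_id and xor_id are hypothetical, chosen only for this
illustration.

```python
import hashlib

# SHA-1 digests are 20 bytes, so the running sum wraps at 2**160.
MOD = 1 << 160


def stable_id(file_diffs):
    """Sum the SHA-1 of each file diff as a 160-bit integer.

    Addition is commutative, so the order of the diffs does not matter.
    """
    total = 0
    for diff in file_diffs:
        digest = hashlib.sha1(diff.encode()).digest()
        total = (total + int.from_bytes(digest, "little")) % MOD
    return total.to_bytes(20, "little").hex()


def xor_id(file_diffs):
    """XOR variant, shown only to illustrate the problem described above."""
    total = 0
    for diff in file_diffs:
        digest = hashlib.sha1(diff.encode()).digest()
        total ^= int.from_bytes(digest, "little")
    return total.to_bytes(20, "little").hex()


header = "diff for api.h (placeholder text)"
impl = "diff for api.c (placeholder text)"

# Reordering the file diffs leaves the summed ID unchanged.
print(stable_id([header, impl]) == stable_id([impl, header]))

# Two identical diffs cancel out under xor but not under the sum.
print(xor_id([impl, impl]))
print(stable_id([impl, impl]))
```

Both variants are order-independent; the difference is that xor collapses
any pair of identical diffs to all zeroes, while the wrapped sum does not.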