author | Russell Belfer <rb@github.com> | 2013-01-02 17:14:00 -0800 |
---|---|---|
committer | Russell Belfer <rb@github.com> | 2013-01-04 15:47:43 -0800 |
commit | 77cffa31db07187c2fa65457ace1b6cb2547dc5b (patch) | |
tree | 95228829b8f5f4db980e3f37501d9b4eed20addf /docs | |
parent | b3fb9237c215e9a0e2e042afd9252d541ce40541 (diff) | |
Simplify checkout documentation
This moves a lot of the detailed checkout documentation into a new
file (docs/checkout-internals.md) and simplifies the public docs
for the checkout API.
Diffstat (limited to 'docs')
-rw-r--r-- | docs/checkout-internals.md | 203 |
1 files changed, 203 insertions, 0 deletions
diff --git a/docs/checkout-internals.md b/docs/checkout-internals.md
new file mode 100644
index 000000000..cb646da5d
--- /dev/null
+++ b/docs/checkout-internals.md
@@ -0,0 +1,203 @@
+Checkout Internals
+==================
+
+Checkout has to handle a lot of different cases. It examines the
+differences between the target tree, the baseline tree and the working
+directory, plus the contents of the index, and groups files into five
+categories:
+
+1. UNMODIFIED - Files that match in all places.
+2. SAFE - Files where the working directory and the baseline content
+   match that can be safely updated to the target.
+3. DIRTY/MISSING - Files where the working directory differs from the
+   baseline but there is no conflicting change with the target. One
+   example is a file that doesn't exist in the working directory - no
+   data would be lost as a result of writing this file. Which action
+   will be taken with these files depends on the options you use.
+4. CONFLICTS - Files where changes in the working directory conflict
+   with changes to be applied by the target. If conflicts are found,
+   they prevent any other modifications from being made (although there
+   are options to override that and force the update, of course).
+5. UNTRACKED/IGNORED - Files in the working directory that are untracked
+   or ignored (i.e. only in the working directory, not the other places).
+
+Right now, this classification is done via 3 iterators (for the three
+trees), with a final lookup in the index. At some point, this may move to
+a 4 iterator version to incorporate the index better.
+
+The actual checkout is done in five phases (at least right now).
+
+1. The diff between the baseline and the target tree is used as a base
+   list of possible updates to be applied.
+2. Iterate through the diff and the working directory, building a list of
+   actions to be taken (and sending notifications about conflicts and
+   dirty files).
+3. Remove any files / directories as needed (because alphabetical
+   iteration means that an untracked directory will end up sorted *after*
+   a blob that should be checked out with the same name).
+4. Update all blobs.
+5. Update all submodules (after 4 in case a new .gitmodules blob was
+   checked out).
+
+Checkout could be driven either off a target-to-workdir diff or a
+baseline-to-target diff. There are pros and cons of each.
+
+Target-to-workdir means the diff includes every file that could be
+modified, which simplifies bookkeeping, but the code to constantly refer
+back to the baseline gets complicated.
+
+Baseline-to-target has simpler code because the diff defines the action to
+take, but needs special handling for untracked and ignored files, if they
+need to be removed.
+
+The current checkout implementation is based on a baseline-to-target diff.
+
+
+Picking Actions
+===============
+
+The most interesting aspect of this is phase 2, picking the actions that
+should be taken. There are a lot of corner cases, so it may be easier to
+start by looking at the rules for a simple 2-iterator diff:
+
+Key
+---
+- B1,B2,B3 - blobs with different SHAs,
+- Bi - ignored blob (WD only)
+- T1,T2,T3 - trees with different SHAs,
+- Ti - ignored tree (WD only)
+- x - nothing
+
+Diff with 2 non-workdir iterators
+---------------------------------
+
+        Old  New
+        ---  ---
+     0  x    x   - nothing
+     1  x    B1  - added blob
+     2  x    T1  - added tree
+     3  B1   x   - removed blob
+     4  B1   B1  - unmodified blob
+     5  B1   B2  - modified blob
+     6  B1   T1  - typechange blob -> tree
+     7  T1   x   - removed tree
+     8  T1   B1  - typechange tree -> blob
+     9  T1   T1  - unmodified tree
+    10  T1   T2  - modified tree (implies modified/added/removed blob inside)
+
+
+Now, let's make the "New" iterator into a working directory iterator, so
+we replace "added" items with either untracked or ignored, like this:
+
+Diff with non-work & workdir iterators
+--------------------------------------
+
+        Old  New-WD
+        ---  ------
+     0  x    x      - nothing
+     1  x    B1     - untracked blob
+     2  x    Bi     - ignored file
+     3  x    T1     - untracked tree
+     4  x    Ti     - ignored tree
+     5  B1   x      - removed blob
+     6  B1   B1     - unmodified blob
+     7  B1   B2     - modified blob
+     8  B1   T1     - typechange blob -> tree
+     9  B1   Ti     - removed blob AND ignored tree as separate items
+    10  T1   x      - removed tree
+    11  T1   B1     - typechange tree -> blob
+    12  T1   Bi     - removed tree AND ignored blob as separate items
+    13  T1   T1     - unmodified tree
+    14  T1   T2     - modified tree (implies modified/added/removed blob inside)
+
+Note: if there is a corresponding entry in the old tree, then a working
+directory item won't be ignored (i.e. no Bi or Ti for tracked items).
+
+
+Now, expand this to three iterators: a baseline tree, a target tree, and
+an actual working directory tree:
+
+Checkout From 3 Iterators (2 not workdir, 1 workdir)
+----------------------------------------------------
+
+(base == old HEAD; target == what to checkout; actual == working dir)
+
+         base  target  actual/workdir
+         ----  ------  ------
+     0   x     x       x              - nothing
+     1   x     x       B1/Bi/T1/Ti    - untracked/ignored blob/tree (SAFE)
+     2+  x     B1      x              - add blob (SAFE)
+     3   x     B1      B1             - independently added blob (FORCEABLE-2)
+     4*  x     B1      B2/Bi/T1/Ti    - add blob with content conflict (FORCEABLE-2)
+     5+  x     T1      x              - add tree (SAFE)
+     6*  x     T1      B1/Bi          - add tree with blob conflict (FORCEABLE-2)
+     7   x     T1      T1/i           - independently added tree (SAFE+MISSING)
+     8   B1    x       x              - independently deleted blob (SAFE+MISSING)
+     9-  B1    x       B1             - delete blob (SAFE)
+    10-  B1    x       B2             - delete of modified blob (FORCEABLE-1)
+    11   B1    x       T1/Ti          - independently deleted blob AND untrack/ign tree (SAFE+MISSING !!!)
+    12   B1    B1      x              - locally deleted blob (DIRTY || SAFE+CREATE)
+    13+  B1    B2      x              - update to deleted blob (SAFE+MISSING)
+    14   B1    B1      B1             - unmodified file (SAFE)
+    15   B1    B1      B2             - locally modified file (DIRTY)
+    16+  B1    B2      B1             - update unmodified blob (SAFE)
+    17   B1    B2      B2             - independently updated blob (FORCEABLE-1)
+    18+  B1    B2      B3             - update to modified blob (FORCEABLE-1)
+    19   B1    B1      T1/Ti          - locally deleted blob AND untrack/ign tree (DIRTY)
+    20*  B1    B2      T1/Ti          - update to deleted blob AND untrack/ign tree (F-1)
+    21+  B1    T1      x              - add tree with locally deleted blob (SAFE+MISSING)
+    22*  B1    T1      B1             - add tree AND deleted blob (SAFE)
+    23*  B1    T1      B2             - add tree with delete of modified blob (F-1)
+    24   B1    T1      T1             - add tree with deleted blob (F-1)
+    25   T1    x       x              - independently deleted tree (SAFE+MISSING)
+    26   T1    x       B1/Bi          - independently deleted tree AND untrack/ign blob (F-1)
+    27-  T1    x       T1             - deleted tree (MAYBE SAFE)
+    28+  T1    B1      x              - deleted tree AND added blob (SAFE+MISSING)
+    29   T1    B1      B1             - independently typechanged tree -> blob (F-1)
+    30+  T1    B1      B2             - typechange tree->blob with conflicting blob (F-1)
+    31*  T1    B1      T1/T2          - typechange tree->blob (MAYBE SAFE)
+    32+  T1    T1      x              - restore locally deleted tree (SAFE+MISSING)
+    33   T1    T1      B1/Bi          - locally typechange tree->untrack/ign blob (DIRTY)
+    34   T1    T1      T1/T2          - unmodified tree (MAYBE SAFE)
+    35+  T1    T2      x              - update locally deleted tree (SAFE+MISSING)
+    36*  T1    T2      B1/Bi          - update to tree with typechanged tree->blob conflict (F-1)
+    37   T1    T2      T1/T2/T3       - update to existing tree (MAYBE SAFE)
+
+The number is followed by ' ' if no change is needed or '+' if the case
+needs to write to disk or '-' if something must be deleted and '*' if
+there should be a delete followed by a write.
+
+There are five tiers of cases:
+
+- SAFE == completely safe to update
+- SAFE+MISSING == safe except the workdir is missing the expected content
+- MAYBE SAFE == safe if workdir tree matches (or is missing) baseline
+  content, which is unknown at this point
+- FORCEABLE == conflict unless FORCE is given
+- DIRTY == no conflict but change is not applied unless FORCE
+
+Some slightly unusual circumstances:
+
+     8 - parent dir is only deleted when file is, so parent will be left if
+         empty even though it would be deleted if the file were present
+    11 - core git does not consider this a conflict but attempts to delete T1
+         and gives "unable to unlink file" error yet does not skip the rest
+         of the operation
+    12 - without FORCE file is left deleted (i.e. not restored) so new wd is
+         dirty (and warning message "D file" is printed), with FORCE, file is
+         restored.
+    24 - This should be considered MAYBE SAFE since effectively it is 7 and 8
+         combined, but core git considers this a conflict unless forced.
+    26 - This combines two cases (1 & 25) (and also implied 8 for tree content)
+         which are ok on their own, but core git treats this as a conflict.
+         If not forced, this is a conflict. If forced, this actually doesn't
+         have to write anything and leaves the new blob as an untracked file.
+    32 - This is the only case where the baseline and target values match
+         and yet we will still write to the working directory. In all other
+         cases, if baseline == target, we don't touch the workdir (it is
+         either already right or is "dirty"). However, since this case also
+         implies that a ?/B1/x case will exist as well, it can be skipped.
+
+Cases 3, 17, 24, 26, and 29 are all considered conflicts even though
+none of them will require making any updates to the working directory.
+
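For readers working through the tables, the rules in "Diff with 2 non-workdir iterators" can be written out as a small function. This is only an illustrative sketch in Python (chosen for brevity; libgit2 itself is written in C) and is not code from this commit. `Entry` and `classify_two_way` are hypothetical names, an absent entry ("x" in the table) is modelled as `None`, and the `sha` field is a placeholder string rather than a real object id.

```python
from typing import NamedTuple, Optional, Tuple


class Entry(NamedTuple):
    """One side of the comparison (hypothetical type, not a libgit2 struct)."""
    kind: str  # "blob" or "tree"
    sha: str   # placeholder identifier; equal strings mean equal content


def classify_two_way(old: Optional[Entry], new: Optional[Entry]) -> Tuple[int, str]:
    """Return (case number, description) from the 2-iterator table."""
    if old is None and new is None:
        return 0, "nothing"
    if old is None:
        # Only the new side exists: something was added.
        return (1, "added blob") if new.kind == "blob" else (2, "added tree")
    if old.kind == "blob":
        if new is None:
            return 3, "removed blob"
        if new.kind == "tree":
            return 6, "typechange blob -> tree"
        if old.sha == new.sha:
            return 4, "unmodified blob"
        return 5, "modified blob"
    # From here on the old side is a tree.
    if new is None:
        return 7, "removed tree"
    if new.kind == "blob":
        return 8, "typechange tree -> blob"
    if old.sha == new.sha:
        return 9, "unmodified tree"
    return 10, "modified tree"
```

The workdir and 3-iterator tables extend the same idea with more inputs (ignored entries and a third side), so a comparable sketch for them would need more branches but no new technique.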