gitnotes
Table of Contents
- 1. Git repository philosophy
- 2. Maintainer vs. Contributor Overhead
- 3. Short-term change workflow, no remote collaboration planned
- 3.1. Fork repository to own github account
- 3.2. Create local clone of your github fork
- 3.3. Add main repository as a remote for your local clone
- 3.4. Create a working branch on your local clone
- 3.5. Regularly fetch from upstream
- 3.6. Regularly rebase your working branch to current upstream
- 3.7. (optional) Push your in-progress branch to fork/master marked as private
- 3.8. After your fix is finished/tested, do a final cleanup
- 3.9. Push your local branch to your github fork and initiate a pull request.
- 3.10. (optional) Make another local branch from the pull request branch
- 3.11. (optional) Intermediate/final rebases for long-running pull requests
- 3.12. After upstream merge, delete your change branches
- 4. Task-based collaboration branch workflow
- 5. Misc git tips
1 Git repository philosophy
The main repository should contain a mostly linear history composed of complete, well-tested, well-described changes that reflect relatively cohesive units of change. Any branch/merge points should be for long-running and well-defined units of work, and should similarly be mostly linear and have high-quality commits.
This makes it easy to see how development has progressed over time and what work was involved in creating new features and fixing particular issues. It also makes it easy to pull complete change-sets for features or issue fixes from one branch to another. This will be valuable to downstream maintainers of stable code lines for their products.
On the other hand, software development is sometimes a messy process and a version control system is invaluable to managing the daily process of changing a code base. Engineers should not wait for the "perfect change set" before committing changes. Fortunately, git provides mechanisms to support both aspects of software change management, though it requires application of a bit of discipline in the way git commands are applied to everyday work vs. merging complete changes with the official repository.
The dividing line falls between "public change history" vs. "private change history". Private history management is fluid and relatively un-structured, but maintained separately from public history in forked repos and local repo clones. It should involve numerous task-specific branches and should be viewed as a cooperative process between git and the user in developing a coherent change set for upstream. Individual commits are mostly unimportant in the long run; the only thing that matters is that the end result is a coherent, well-described and well-tested change to submit upstream.
Once a change is accepted upstream, however, it becomes a permanent record of the change it embodies. Every effort should be taken to make it correct and useful as an individual commit, because it will be forever a part of the repository history. No change should leave things in an inconsistent state, even if it only encompasses one aspect of a related set of changes. These kinds of commits don't just happen; they have to be carefully organized from the chaos of private branch history.
Linus's take on it:
http://thread.gmane.org/gmane.comp.video.dri.devel/34739/focus=34744
Articles on mechanics of merge and rebase:
http://www.derekgourlay.com/archives/428
http://mislav.uniqpath.com/2013/02/merge-vs-rebase/
https://medium.com/@porteneuve/getting-solid-at-git-rebase-vs-merge-4fa1a48c53aa
http://codeinthehole.com/writing/pull-requests-and-other-good-practices-for-teams-using-github/
2 Maintainer vs. Contributor Overhead
Plenty of potential maintainers and contributors are not yet familiar with using this style of distributed change management, and there is definitely some overhead involved in becoming familiar with using git this way. We should view the process of accepting contributions as a collaborative one in which we make an effort to guide maintainers and collaborators in learning how to prepare good pull requests.
When contributors are experienced and/or willing to learn proper procedures, we should politely suggest how to change their pull request to remove extraneous intermediate commits, merges, etc. at the beginning of the review process, as doing the required rebase will effectively end the pull request tracking. If the contributor doesn't have time or is unwilling, then we should again politely inform them why the submission can't be accepted in its current form and reject it.
If the change is otherwise good and provides an important needed feature or bug-fix, a maintainer should note that they'll be cherry-picking the change with attribution to the contributor but that it can't be merged as-is. The maintainer will have to manually fetch the pull request and rework it into a good change set before merging, then close the pull request with a note that it was merged manually.
Lower priority changes should just be closed after polite suggestions on how to re-work them into an acceptable form.
3 Short-term change workflow, no remote collaboration planned
This workflow applies to small bugfixes, small-scope changes, single-user contributions, etc. Although it may seem a little complicated, it really represents a model that treats upstream similarly to a centralized VCS while giving you the power to manage your own development with whatever version control practices suit you.
https://guides.github.com/activities/contributing-to-open-source/
3.1 Fork repository to own github account
This will be the public staging area for your contributions, and you'll create a branch here when you're ready for code review. This need only be done once per contributor/organization; you just need a github-managed clone you have write permission on from which to start pull requests.
3.2 Create local clone of your github fork
This will be the private staging area for your contributions. Private changes can be re-written by you at will, as no one else will be depending on the repository history structure. This only needs to be done once per workstation, of course.
git clone <URL of your github fork>
3.3 Add main repository as a remote for your local clone
This will allow you to fetch updates from the upstream branch you're working on and also to push them to your remote fork from a single local work area.
The following creates a new remote named "upstream" that points to the main repo's open-avb-next branch:
git remote add --track open-avb-next upstream git://github.com/AVnu/OpenAvnu.git
note: You can use this command to test other people's pull requests by adding their pull request branches as remotes, fetching them, and checking them out.
Your own fork remote will be named "origin" unless you told git otherwise when you cloned.
Again, this is a one-time configuration change per clone.
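As a rough sketch, steps 3.2 and 3.3 can be tried out without touching GitHub by building throwaway stand-ins in a temporary directory. The directory names main, fork.git and work and the Dev identity are hypothetical, and git init -b assumes Git 2.28 or newer.

```shell
set -e
sandbox=$(mktemp -d)   # throwaway area; no real repository is touched
export GIT_AUTHOR_NAME=Dev GIT_AUTHOR_EMAIL=dev@example.invalid
export GIT_COMMITTER_NAME=Dev GIT_COMMITTER_EMAIL=dev@example.invalid

# "main" plays the main repository, "fork.git" plays your GitHub fork.
git init -q -b open-avb-next "$sandbox/main"
git -C "$sandbox/main" commit -q --allow-empty -m "Initial upstream commit"
git clone -q --bare "$sandbox/main" "$sandbox/fork.git"

# Step 3.2: clone the fork. Step 3.3: add the main repository as "upstream".
git clone -q "$sandbox/fork.git" "$sandbox/work"
cd "$sandbox/work"
git remote add --track open-avb-next upstream "$sandbox/main"
git fetch -q upstream

# Show the configured remotes and the remote-tracking branches they provide.
git remote -v
git branch -r
```

The listing should show both origin (the fork) and upstream (the main repository), each with an open-avb-next remote-tracking branch.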
3.4 Create a working branch on your local clone
Don't make changes directly to the remote tracking branches for upstream or your fork. This would make it difficult to cleanly deal with upstream changes that occur while your work is in progress.
Name your working branch after the topic you'll be working on, e.g. fix-issue-35 or add-feature-X. This is a note to yourself and others about what the change set you're developing is for, and a reminder not to commit unrelated changes in it.
For example, if you want to submit a change to open-avb-next, you would first make sure your fork and local clone are up-to-date, then issue:
git checkout open-avb-next
git checkout -b fix_issue_323
This new local branch is where you will do all your work. Commit regularly. If the work starts to collect a lot of minor changes, use interactive rebases (git rebase -i) to clean them up into a small, meaningful patchset.
Keeping the local patchset trimmed regularly will make it easier to apply changes from upstream, but doing this too aggressively could hamper your ability to use tools like git bisect to track down the cause of regressions you introduce, so it's probably best to do it after a round of testing.
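The cleanup described above is normally done by hand in the editor that git rebase -i opens. As a non-interactive sketch of the same idea, the following records a correction with git commit --fixup and lets git rebase --autosquash fold it into the commit it corrects. Using --fixup and --autosquash is a substitution made here so the example can run unattended; the file names and the Dev identity are hypothetical, and git init -b assumes Git 2.28 or newer.

```shell
set -e
sandbox=$(mktemp -d)   # throwaway area; no real repository is touched
export GIT_AUTHOR_NAME=Dev GIT_AUTHOR_EMAIL=dev@example.invalid
export GIT_COMMITTER_NAME=Dev GIT_COMMITTER_EMAIL=dev@example.invalid

git init -q -b open-avb-next "$sandbox/work"
cd "$sandbox/work"
git commit -q --allow-empty -m "Initial upstream commit"

# Day-to-day work on the topic branch, including a minor correction.
git checkout -q -b fix_issue_323
echo "first attempt" > fix.txt
git add fix.txt
git commit -q -m "Fix issue 323"
echo "corrected attempt" > fix.txt
git commit -q -a --fixup=HEAD      # recorded as "fixup! Fix issue 323"
echo "notes" > notes.txt
git add notes.txt
git commit -q -m "Document the fix"

# --autosquash moves the fixup commit next to its target in the todo list;
# GIT_SEQUENCE_EDITOR=true accepts that list without opening an editor.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash open-avb-next

git log --oneline open-avb-next..fix_issue_323
```

The three working commits collapse into two meaningful ones, and fix.txt keeps the corrected content.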
3.5 Regularly fetch from upstream
This keeps your local tracking version of permanent branches up-to-date with upstream. Also push these to your fork's permanent branches so they stay consistent.
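To do this for the open-avb-next branch (the pull names the branch explicitly, because your local open-avb-next normally tracks origin rather than upstream):
git checkout open-avb-next
git fetch upstream
git pull --ff-only upstream open-avb-next
git push origin open-avb-next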
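A minimal sandbox sketch of this sync step follows, with local directories standing in for the main repository and the fork. The names main, fork.git and work and the Dev identity are hypothetical, git init -b assumes Git 2.28 or newer, and the sketch names the branch explicitly in the pull so that it does not depend on tracking configuration.

```shell
set -e
sandbox=$(mktemp -d)   # throwaway area; no real repository is touched
export GIT_AUTHOR_NAME=Dev GIT_AUTHOR_EMAIL=dev@example.invalid
export GIT_COMMITTER_NAME=Dev GIT_COMMITTER_EMAIL=dev@example.invalid

# Stand-ins: "main" is the main repository, "fork.git" is your fork.
git init -q -b open-avb-next "$sandbox/main"
git -C "$sandbox/main" commit -q --allow-empty -m "Initial upstream commit"
git clone -q --bare "$sandbox/main" "$sandbox/fork.git"
git clone -q "$sandbox/fork.git" "$sandbox/work"
cd "$sandbox/work"
git remote add --track open-avb-next upstream "$sandbox/main"

# Upstream moves ahead while you are working.
git -C "$sandbox/main" commit -q --allow-empty -m "New upstream change"

# The sync step: update the local branch, then bring the fork along.
git checkout -q open-avb-next
git fetch -q upstream
git pull -q --ff-only upstream open-avb-next
git push -q origin open-avb-next

git log --oneline
```

Afterwards the main repository, the local clone and the fork all point at the same open-avb-next commit.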
3.6 Regularly rebase your working branch to current upstream
Don't merge upstream changes to your private set of changes; this will just complicate the upstream history with unnecessary detail when your changes are accepted. Rebase instead.
Rebase rolls back to the branch point, applies upstream changes, then re-applies each of your changes to the new upstream head point that your branch will now diverge from.
This will allow your bugfix to be applied as a fast-forward merge to upstream, which keeps the history clean and linear while ensuring that you're still tracking upstream changes along with your own.
After you update your open-avb-next branch:
git checkout issue-32-working
git rebase open-avb-next
See git help rebase for more information on the rebase command.
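To see the effect of the rebase on history, here is a small sketch in a single throwaway repository, where a local commit on open-avb-next stands in for changes fetched from upstream. The file names and the Dev identity are hypothetical, and git init -b assumes Git 2.28 or newer.

```shell
set -e
sandbox=$(mktemp -d)   # throwaway area; no real repository is touched
export GIT_AUTHOR_NAME=Dev GIT_AUTHOR_EMAIL=dev@example.invalid
export GIT_COMMITTER_NAME=Dev GIT_COMMITTER_EMAIL=dev@example.invalid

git init -q -b open-avb-next "$sandbox/work"
cd "$sandbox/work"
echo base > base.txt
git add base.txt
git commit -q -m "Initial upstream commit"

# Your private work, branched from the old upstream head.
git checkout -q -b issue-32-working
echo fix > fix.txt
git add fix.txt
git commit -q -m "Fix issue 32"

# Simulate open-avb-next having been updated from upstream in the meantime.
git checkout -q open-avb-next
echo more > upstream.txt
git add upstream.txt
git commit -q -m "New upstream change"

# Re-apply your work on top of the new upstream head.
git checkout -q issue-32-working
git rebase -q open-avb-next

git log --oneline --graph
```

The graph is a single straight line with your commit on top, so upstream could take the branch as a fast-forward.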
3.7 (optional) Push your in-progress branch to fork/master marked as private
If you want to ensure your in-progress work is backed-up or have others view it, push it to your forked repository but ensure that the branch name has "-private" or some other indicator in it that others should not attempt to make changes directly to it, as you may change the set of changes in it at any point due to rebasing:
git push --set-upstream origin <local branch name>
3.8 After your fix is finished/tested, do a final cleanup
Do a fetch/rebase of upstream/master and re-test if any changes occurred. Do an interactive rebase to compact your changes into a small, well-documented change set.
git rebase -i
See the INTERACTIVE MODE section of git help rebase. Note that you can easily merge and fix up commit messages this way, but you can also split apart commits and otherwise clean things up while the rebase is in progress.
When the change set includes both test changes and code changes that make the tests pass, keep the test and code changes in separate commits and order them such that the tests can be applied first and verified to fail before the code changes that fix them are applied.
Include audit trail notes (issue references, links to other discussion, etc) along with a well-formed and detailed commit message.
http://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html
3.9 Push your local branch to your github fork and initiate a pull request.
Initiate the push and create a remote tracking branch:
git push --set-upstream origin <local branch name>
Now go to the GitHub page for your fork, click on the "Pull Requests" tab, and then click on the green "New pull request" button. Make sure the two refs point at the repositories and branches you want, then click the "Create pull request" button.
This is the beginning of the code review process, and your contribution is now public, so you should not do any rebases during this period as they will complicate the process of reviewing your code.
If you already pushed your local changes to a private branch on your forked github repo, you can initiate a pull request from there, but you'll need to remember not to rebase while the review is in progress.
If your pull request branch was previously pushed but has been rebased locally, a normal push command will fail because the remote's update won't be a fast-forward. Use the following command instead:
git push --force-with-lease
This will stop any non-fast-forward update to a remote branch where someone has pushed changes that you don't have locally. If you've followed good "branch hygiene" it shouldn't be a problem, but if your git config has push.default = matching, a plain --force could potentially cause unintentional changes to your GitHub fork.
If you are unsure what will happen during a push command, use the -n option to do a dry-run first.
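The following sandbox sketch walks through that situation: a branch is pushed, rewritten locally, rejected by a plain push, previewed with -n, and then updated with --force-with-lease. A bare local repository named fork.git stands in for the GitHub fork; that name, the file name and the Dev identity are hypothetical, and git init -b assumes Git 2.28 or newer.

```shell
set -e
sandbox=$(mktemp -d)   # throwaway area; no real repository is touched
export GIT_AUTHOR_NAME=Dev GIT_AUTHOR_EMAIL=dev@example.invalid
export GIT_COMMITTER_NAME=Dev GIT_COMMITTER_EMAIL=dev@example.invalid

git init -q --bare "$sandbox/fork.git"          # stand-in for your fork
git init -q -b open-avb-next "$sandbox/work"
cd "$sandbox/work"
git remote add origin "$sandbox/fork.git"
git commit -q --allow-empty -m "Initial upstream commit"
git push -q origin open-avb-next

git checkout -q -b fix_issue_323
echo "first attempt" > fix.txt
git add fix.txt
git commit -q -m "WIP fix"
git push -q --set-upstream origin fix_issue_323

# Rewrite the already-pushed commit, as a rebase would.
git commit -q --amend -m "Fix issue 323"

# A plain push is now refused because it is not a fast-forward.
if git push -q origin fix_issue_323 2>/dev/null; then
    plain_push=accepted
else
    plain_push=rejected
fi
echo "plain push: $plain_push"

# Preview the forced update, then perform it.
git push -n --force-with-lease origin fix_issue_323
git push -q --force-with-lease origin fix_issue_323
```

The lease succeeds here because the fork's branch still matches the local remote-tracking ref; had someone else pushed to it in between, the forced push would have been refused as well.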
3.10 (optional) Make another local branch from the pull request branch
This will be a private branch again, so you can freely do interactive rebases to clean up your history into a clean patch set. Do a fast-forward merge back into the pull request tracking branch when you're ready for another round of review and push it to your fork.
From the head of the pull request branch, just run:
git checkout -b <new branch name>
When you want to add your new commits to the pull request, do any interactive rebasing you would like and then run:
git checkout <original pull request branch>
git merge --ff-only <new branch name>
And if you're done with the new work branch:
git branch -d <new branch name>
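Put together, the follow-up branch round trip might look like the sketch below. The branch name fix_issue_323-review and the Dev identity are hypothetical, empty commits stand in for real work, and git init -b assumes Git 2.28 or newer.

```shell
set -e
sandbox=$(mktemp -d)   # throwaway area; no real repository is touched
export GIT_AUTHOR_NAME=Dev GIT_AUTHOR_EMAIL=dev@example.invalid
export GIT_COMMITTER_NAME=Dev GIT_COMMITTER_EMAIL=dev@example.invalid

git init -q -b open-avb-next "$sandbox/work"
cd "$sandbox/work"
git commit -q --allow-empty -m "Initial upstream commit"

# The branch the pull request was opened from.
git checkout -q -b fix_issue_323
git commit -q --allow-empty -m "Fix issue 323"

# Private follow-up branch; rebase freely here during review.
git checkout -q -b fix_issue_323-review
git commit -q --allow-empty -m "Address review comments"

# Fold the finished follow-up work back in without creating a merge commit.
git checkout -q fix_issue_323
git merge -q --ff-only fix_issue_323-review
git branch -q -d fix_issue_323-review

git log --oneline
```

Because the merge is restricted to a fast-forward, the pull request branch only ever gains commits on top of what reviewers have already seen.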
3.11 (optional) Intermediate/final rebases for long-running pull requests
If you have to do a significant re-write of your change, or the branch your change applies to changes significantly during the review process, you may need to re-base your pull request branch onto the tip of the branch from which it started. This uses the same commands as rebasing before you issued the pull request.
When this operation is pushed to your forked repo on GitHub, it will update the pull request automatically. All comments referring to the previous history of the pull request branch will be noted as 'stale', though it will hold onto the old references if reviewers need to see what the old version of the code looked like.
You should be aware of anyone who has been tracking your pull request branch, as this semi-public rebasing can cause issues if others have made changes to local tracking branches of your pull request branch.
In turn, you should be aware that your own changes to tracking branches of the pull request branches of others could have their branch points removed. You'll need to cherry-pick your changes onto the new pull request after this happens.
3.12 After upstream merge, delete your change branches
There's no reason to keep them around; they are now part of upstream's development history and will show up in your repository via your upstream tracking branch. All this messing about with branches and rebasing serves to emulate what happens in a centralized VCS such as SVN, but it provides developers with a finer-grained control of their own patch-set development.
Delete a local branch (use -D instead if the branch to delete is not fully merged):
git branch -d <branch name>
Delete a local remote-tracking branch (this will return during the next fetch if the remote branch it tracks is still there):
git branch -rd <remote>/<branch name>
Delete a remote branch (push to remote branch with an empty ref to the local branch, which normally goes before the colon):
git push <repository> :<branch name>
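The three deletions can be tried together in a sandbox, as sketched below. A bare local repository named fork.git stands in for the GitHub fork and a local fast-forward merge stands in for the upstream merge; those names and the Dev identity are hypothetical, and git init -b assumes Git 2.28 or newer.

```shell
set -e
sandbox=$(mktemp -d)   # throwaway area; no real repository is touched
export GIT_AUTHOR_NAME=Dev GIT_AUTHOR_EMAIL=dev@example.invalid
export GIT_COMMITTER_NAME=Dev GIT_COMMITTER_EMAIL=dev@example.invalid

git init -q --bare "$sandbox/fork.git"          # stand-in for your fork
git init -q -b open-avb-next "$sandbox/work"
cd "$sandbox/work"
git remote add origin "$sandbox/fork.git"
git commit -q --allow-empty -m "Initial upstream commit"

git checkout -q -b fix_issue_323
git commit -q --allow-empty -m "Fix issue 323"
git push -q --set-upstream origin fix_issue_323

# Pretend upstream merged the fix and you have pulled it.
git checkout -q open-avb-next
git merge -q --ff-only fix_issue_323

git branch -q -d fix_issue_323            # local branch
git branch -q -rd origin/fix_issue_323    # local remote-tracking branch
git push -q origin :fix_issue_323         # branch on the fork

git branch -a
```

Note that pushing the deletion also removes the matching remote-tracking branch, so the middle command is only needed when the branch on the remote has already been deleted some other way.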
4 Task-based collaboration branch workflow
<to be written>
5 Misc git tips
5.1 Using git's "reuse recorded resolution" feature
When working with multiple repositories and performing a lot of merge and rebase operations, you may find yourself having to resolve essentially the same merge conflict over and over again. Fortunately, git provides a tool to record the resolutions you supply and automatically replay them when the conflict matches one with a recorded resolution.
To enable this, run the following command:
git config --global rerere.enabled true
See git help rerere for more details.
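As an illustration of what rerere does, the sketch below provokes the same conflict twice in a throwaway repository and resolves it by hand only the first time. It enables rerere for that one repository rather than globally; the file name, branch name topic and the Dev identity are hypothetical, and git init -b assumes Git 2.28 or newer.

```shell
set -e
sandbox=$(mktemp -d)   # throwaway area; no real repository is touched
export GIT_AUTHOR_NAME=Dev GIT_AUTHOR_EMAIL=dev@example.invalid
export GIT_COMMITTER_NAME=Dev GIT_COMMITTER_EMAIL=dev@example.invalid

git init -q -b open-avb-next "$sandbox/work"
cd "$sandbox/work"
git config rerere.enabled true      # per-repository here, not --global

echo "original" > greeting.txt
git add greeting.txt
git commit -q -m "Initial commit"

# Two branches change the same line in different ways.
git checkout -q -b topic
echo "topic version" > greeting.txt
git commit -q -a -m "Topic change"
git checkout -q open-avb-next
echo "upstream version" > greeting.txt
git commit -q -a -m "Upstream change"

# First merge: conflict, resolved by hand; the commit records the resolution.
git merge -q topic >/dev/null 2>&1 || true
echo "merged version" > greeting.txt
git add greeting.txt
git commit -q -m "Merge topic"

# Undo the merge and repeat it: rerere replays the recorded resolution.
git reset -q --hard HEAD^
git merge -q topic >/dev/null 2>&1 || true
cat greeting.txt
```

The second merge still stops for you to review and commit, but the file already contains the earlier resolution instead of conflict markers.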
5.2 Splitting a commit
If you are cleaning up a working branch in preparation for a pull request, you may find that you mixed multiple logical changes into a single commit. Fortunately, it is not too difficult to break it apart and group changes in a more sensible order. In fact, it is sometimes easier to refactor a particularly messy branch by first squashing it into a single commit and then breaking that single commit into a logical set of changes.
The first step is to find the commit before the one you wish to split. You can either look it up in the log, or if you have a ref already for the one to split, you can use the ^ syntax to refer to the previous one from it. Start an interactive rebase:
git rebase -i ref^
or
git rebase -i prevref
Now mark the commit to split with "edit" in your editor, save and quit. The rebase will replay history up to and including that commit and then stop, leaving it as the checked-out HEAD with a clean working directory.
The next thing to do is to undo that commit while keeping its changes, via the reset command, which rolls the checked-out ref back to the version you specify without changing the files in the working directory:
git reset HEAD^
Now all the changes for the commit you are editing are in the list of unstaged but modified files. Just select the first set of edits for your new first commit, then stage and commit them. Repeat until the desired change set has been built and the working directory is clean, then continue the rebase operation:
git rebase --continue
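The whole split can be rehearsed in a sandbox, as in the sketch below. Instead of editing the todo list by hand, it uses GIT_SEQUENCE_EDITOR with a sed one-liner to change the first "pick" to "edit"; that shortcut, the file names and the Dev identity are assumptions made for the example, and git init -b assumes Git 2.28 or newer.

```shell
set -e
sandbox=$(mktemp -d)   # throwaway area; no real repository is touched
export GIT_AUTHOR_NAME=Dev GIT_AUTHOR_EMAIL=dev@example.invalid
export GIT_COMMITTER_NAME=Dev GIT_COMMITTER_EMAIL=dev@example.invalid

git init -q -b open-avb-next "$sandbox/work"
cd "$sandbox/work"
git commit -q --allow-empty -m "Initial upstream commit"
git checkout -q -b fix_issue_323

# One commit that mixes two unrelated changes.
echo "parser fix" > parser.txt
echo "docs update" > docs.txt
git add parser.txt docs.txt
git commit -q -m "Fix parser and update docs"

# Start the rebase and mark the commit as "edit" without opening an editor.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '1s/^pick/edit/'" git rebase -q -i HEAD^

# Undo the commit but keep its changes in the working directory.
git reset -q HEAD^

# Rebuild it as two separate commits, then finish the rebase.
git add parser.txt
git commit -q -m "Fix parser"
git add docs.txt
git commit -q -m "Update docs"
git rebase --continue

git log --oneline open-avb-next..fix_issue_323
```

The log now lists two focused commits where the mixed one used to be.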
5.3 Adding only some of a file's changes to the index
Sometimes you may find that a single file ends up with edits that belong to two or more distinct changes. Fortunately, you can stage those edits and commit them separately.
The first method is to use the --patch option to git add:
git add --patch <filename>
This will present the diff for the file to you one hunk at a time and give you a list of options for each one. You can choose to add the hunk, not add it, skip all remaining hunks, split the hunk into two (if there are unchanged lines between edits in the hunk), or edit the patch for the hunk directly.
An alternative is to use the --interactive option:
git add --interactive [files]
This gives you a menu more suited to browsing and teasing apart the changes to an entire repository rather than one file. See the man page on git add for more information.
Finally, the git gui interface provides a very convenient way to browse changed files and selectively stage files, hunks, or even ranges of line-level changes. Clicking on a file icon in the "Unstaged Changes" list will stage an entire file. Clicking on the file name will view the diff for the file in the large right-hand pane. There you can right-click on individual hunks or lines and select the option to stage either one; alternatively you can select a range of lines with the left mouse button and then right click for the option to stage the selected range.