gitnotes

1. Git repository philosophy
2. Maintainer vs. Contributor Overhead
3. Short-term change workflow, no remote collaboration planned
4. Task-based collaboration branch workflow
5. Misc git tips

1 Git repository philosophy

The main repository should contain a mostly linear history composed of complete, well-tested, well-described changes that reflect relatively cohesive units of change. Any branch/merge points should be for long-running and well-defined units of work, and should similarly be mostly linear and have high-quality commits.

This makes it easy to see how development has progressed over time and what work was involved in creating new features and fixing particular issues. It also makes it easy to pull complete change-sets for features or issue fixes from one branch to another. This will be valuable to downstream maintainers of stable code lines for their products.

On the other hand, software development is sometimes a messy process, and a version control system is invaluable for managing the daily process of changing a code base. Engineers should not wait for the "perfect change set" before committing changes. Fortunately, git provides mechanisms to support both aspects of software change management, though it takes a bit of discipline to keep the commands used for everyday work separate from those used to merge complete changes into the official repository.

The dividing line falls between "public change history" and "private change history". Private history management is fluid and relatively unstructured, but is maintained separately from public history in forked repos and local repo clones. It should involve numerous task-specific branches and should be viewed as a cooperative process between git and the user in developing a coherent change set for upstream. Individual commits are mostly unimportant in the long run; the only thing that matters is that the end result is a coherent, well-described and well-tested change to submit upstream.

Once a change is accepted upstream, however, it becomes a permanent record of the change it embodies. Every effort should be taken to make it correct and useful as an individual commit, because it will be forever a part of the repository history. No change should leave things in an inconsistent state, even if it only encompasses one aspect of a related set of changes. These kinds of commits don't just happen; they have to be carefully organized from the chaos of private branch history.

Linus's take on it:

http://thread.gmane.org/gmane.comp.video.dri.devel/34739/focus=34744

Articles on mechanics of merge and rebase:

http://www.derekgourlay.com/archives/428

http://mislav.uniqpath.com/2013/02/merge-vs-rebase/

https://medium.com/@porteneuve/getting-solid-at-git-rebase-vs-merge-4fa1a48c53aa

http://codeinthehole.com/writing/pull-requests-and-other-good-practices-for-teams-using-github/

2 Maintainer vs. Contributor Overhead

Plenty of potential maintainers and contributors are not yet familiar with using this style of distributed change management, and there is definitely some overhead involved in becoming familiar with using git this way. We should view the process of accepting contributions as a collaborative one in which we make an effort to guide maintainers and collaborators in learning how to prepare good pull requests.

When contributors are experienced and/or willing to learn proper procedures, we should politely suggest how to change their pull request to remove extraneous intermediate commits, merges, etc. at the beginning of the review process, as doing the required rebase will effectively end the pull request tracking. If the contributor doesn't have time or is unwilling, then we should again politely inform them why the submission can't be accepted in its current form and reject it.

If the change is otherwise good and provides an important needed feature or bug-fix, a maintainer should note that they'll be cherry-picking the change with attribution to the contributor but that it can't be merged as-is. The maintainer will have to manually fetch the pull request and rework it into a good change set before merging, then close the pull request with a note that it was merged manually.

Lower priority changes should just be closed after polite suggestions on how to re-work them into an acceptable form.

3 Short-term change workflow, no remote collaboration planned

This workflow applies to small bugfixes, small-scope changes, single-user contributions, etc. Although it may seem a little complicated, it really represents a model that treats upstream similarly to a centralized VCS while giving you the power to manage your own development with whatever version control practices suit you.

https://guides.github.com/activities/contributing-to-open-source/

3.1 Fork repository to own github account

This will be the public staging area for your contributions, and you'll create a branch here when you're ready for code review. This need only be done once per contributor/organization; you just need a github-managed clone you have write permission on from which to start pull requests.

3.2 Create local clone of your github fork

This will be the private staging area for your contributions. Private changes can be re-written by you at will, as no one else will be depending on the repository history structure. This only needs to be done once per workstation, of course.

git clone <URL of your fork>

3.3 Add main repository as a remote for your local clone

This will allow you to fetch updates from the upstream branch you're working on and also to push them to your remote fork from a single local work area.

The following creates a new remote named "upstream" that points to the main repo's open-avb-next branch:

git remote add --track open-avb-next upstream https://github.com/AVnu/OpenAvnu.git

Note: You can use the same command to test other people's pull requests by adding their forks as remotes, fetching them, and checking out their pull request branches.

Your own fork remote will be named "origin" unless you told git otherwise when you cloned.

Again, this is a one-time configuration change per clone.
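
The clone and remote setup from sections 3.2 and 3.3 can be tried out safely with a sketch like the following. It assumes nothing about GitHub: two local bare repositories (upstream.git and fork.git, both hypothetical names) stand in for the main repo and your fork, and everything lives in a scratch directory.

```shell
set -e
work=$(mktemp -d) && cd "$work"          # scratch area, not a real project
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Seed repository with one commit on open-avb-next.
git init -q seed && cd seed
git checkout -q -b open-avb-next
echo hello > README && git add README && git commit -q -m "Initial commit"
cd "$work"
git clone -q --bare seed upstream.git    # plays the role of the main repo
git clone -q --bare seed fork.git        # plays the role of your GitHub fork

# Section 3.2: clone the fork; it becomes the remote named "origin".
git clone -q fork.git mine && cd mine

# Section 3.3: add the main repo as "upstream", tracking one branch only.
git remote add --track open-avb-next upstream "$work/upstream.git"
git fetch -q upstream

git remote -v     # lists origin and upstream with their URLs
git branch -r     # remote-tracking branches, now including upstream's
```

With a real fork, the two local paths would simply be replaced by the URLs of your fork and of the main repository.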

3.4 Create a working branch on your local clone

Don't make changes directly to the remote tracking branches for upstream or your fork. This would make it difficult to cleanly deal with upstream changes that occur while your work is in progress.

Name your working branch after the topic you'll be working on, e.g. fix-issue-35 or add-feature-X. This is a note to yourself and others about what the change set you're developing is for, and a reminder not to commit unrelated changes in it.

For example, if you want to submit a change to open-avb-next, you would first make sure your fork and local clone are up-to-date, then issue:

git checkout open-avb-next
git checkout -b fix_issue_323

This new local branch is where you will do all your work. Commit regularly. If the work starts to collect a lot of minor changes, use interactive rebases (git rebase -i) to clean them up into a small, meaningful patchset.

Keeping the local patchset trimmed regularly will make it easier to apply changes from upstream, but doing this too aggressively could hamper your ability to use tools like git bisect to track down the cause of regressions you introduce, so it's probably best to do it after a round of testing.
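
One way to keep a working branch trimmed without hand-editing the rebase todo list is git's fixup/autosquash support. The sketch below is only an illustration in a scratch repository: the file name and commit messages are made up, and GIT_SEQUENCE_EDITOR=true is used so the generated todo list is accepted unchanged instead of opening an editor.

```shell
set -e
work=$(mktemp -d) && cd "$work"          # scratch area, not a real project
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q repo && cd repo
git checkout -q -b open-avb-next
echo base > file.txt && git add file.txt && git commit -q -m "Initial commit"

# Working branch with one real commit and one small follow-up correction.
git checkout -q -b fix_issue_323
echo "fix" >> file.txt && git commit -q -am "Fix issue 323"
echo "typo fix" >> file.txt
git commit -q -a --fixup=HEAD            # recorded as "fixup! Fix issue 323"

# --autosquash reorders the todo list so the fixup folds into its target;
# "true" as the sequence editor accepts that list as-is.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash open-avb-next

git log --oneline open-avb-next..fix_issue_323   # commits left on the branch
```

Running git rebase -i open-avb-next without the environment override opens the same todo list in your editor, where commits can also be reordered, reworded, or dropped.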

3.5 Regularly fetch from upstream

This keeps your local tracking version of permanent branches up-to-date with upstream. Also push these to your fork's permanent branches so they stay consistent.

To do this for the open-avb-next branch:

git checkout open-avb-next
git fetch upstream
git pull --ff-only upstream open-avb-next
git push origin open-avb-next
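
The effect of this sync can be checked with a sketch along these lines. It assumes local bare repositories (upstream.git and fork.git, hypothetical names) as stand-ins for the main repo and your fork, and a second clone that simulates someone else landing a change upstream.

```shell
set -e
work=$(mktemp -d) && cd "$work"          # scratch area, not a real project
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q seed && cd seed
git checkout -q -b open-avb-next
echo hello > README && git add README && git commit -q -m "Initial commit"
cd "$work"
git clone -q --bare seed upstream.git    # stand-in for the main repo
git clone -q --bare seed fork.git        # stand-in for your fork
git clone -q fork.git mine
git -C mine remote add --track open-avb-next upstream "$work/upstream.git"

# Someone else's change lands upstream while you are working.
git clone -q upstream.git other && cd other
echo more >> README && git commit -q -am "Upstream change"
git push -q origin open-avb-next
cd "$work/mine"

# The sync itself: fast-forward the local branch, then update the fork.
git checkout -q open-avb-next
git fetch -q upstream
git pull -q --ff-only upstream open-avb-next
git push -q origin open-avb-next

git log --oneline -2
```

Because of --ff-only, the pull refuses to do anything if the local permanent branch has picked up commits of its own, which is a useful early warning that work was committed to the wrong branch.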

3.6 Regularly rebase your working branch to current upstream

Don't merge upstream changes to your private set of changes; this will just complicate the upstream history with unnecessary detail when your changes are accepted. Rebase instead.

Rebase rolls back to the branch point, applies upstream changes, then re-applies each of your changes to the new upstream head point that your branch will now diverge from.

This will allow your bugfix to be applied as a fast-forward merge to upstream, which keeps the history clean and linear while ensuring that you're still tracking upstream changes along with your own.

After you update your open-avb-next branch:

git checkout fix_issue_323
git rebase open-avb-next

See git help rebase for more information on the rebase command.
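
To see what the rebase does to the history, a small self-contained sketch such as the following can help. It uses a single scratch repository, and a local commit on open-avb-next stands in for changes fetched from upstream; file names and messages are illustrative only.

```shell
set -e
work=$(mktemp -d) && cd "$work"          # scratch area, not a real project
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q repo && cd repo
git checkout -q -b open-avb-next
echo base > base.txt && git add base.txt && git commit -q -m "Initial commit"

# Your working branch diverges here.
git checkout -q -b fix_issue_323
echo fix > fix.txt && git add fix.txt && git commit -q -m "Fix issue 323"

# Stand-in for the sync in section 3.5: open-avb-next gains a new commit.
git checkout -q open-avb-next
echo more > upstream.txt && git add upstream.txt && git commit -q -m "Upstream change"

# Replay the working branch on top of the updated open-avb-next.
git checkout -q fix_issue_323
git rebase -q open-avb-next

git log --oneline --graph                # a single straight line of commits
```

After the rebase, open-avb-next is an ancestor of the working branch, which is exactly the condition for a fast-forward merge.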

3.7 (optional) Push your in-progress branch to your fork, marked as private

If you want to ensure your in-progress work is backed-up or have others view it, push it to your forked repository but ensure that the branch name has "-private" or some other indicator in it that others should not attempt to make changes directly to it, as you may change the set of changes in it at any point due to rebasing:

git push -u origin <local branch name>
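
If the local branch name does not already carry a "-private" marker, one option is to add it on the remote side only, using a source:destination refspec. The sketch below assumes a local bare repository (fork.git, a hypothetical name) in place of your GitHub fork.

```shell
set -e
work=$(mktemp -d) && cd "$work"          # scratch area, not a real project
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q seed && cd seed
git checkout -q -b open-avb-next
echo hello > README && git add README && git commit -q -m "Initial commit"
cd "$work"
git clone -q --bare seed fork.git        # stand-in for your fork
git clone -q fork.git mine && cd mine

git checkout -q -b fix_issue_323
echo wip >> README && git commit -q -am "Work in progress on issue 323"

# Local name before the colon, name on the fork after it.
git push -q -u origin fix_issue_323:fix_issue_323-private

# Show which remote branch the local branch now tracks.
git rev-parse --abbrev-ref 'fix_issue_323@{upstream}'
```

Depending on your push.default setting, later pushes may need the same explicit refspec, since the local and remote names differ.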

3.8 After your fix is finished/tested, do a final cleanup

Do a fetch/rebase against the upstream branch (open-avb-next in the examples above) and re-test if any changes occurred. Then do an interactive rebase to compact your changes into a small, well-documented change set.

git rebase -i open-avb-next

See the INTERACTIVE MODE section of git help rebase. Note that you can easily merge and fix up commit messages this way, but you can also split apart commits and otherwise clean things up while the rebase is in progress.

When the change set includes both test changes and code changes that make the tests pass, keep the test and code changes in separate commits and order them such that the tests can be applied first and verified to fail before the code changes that fix them are applied.

Include audit trail notes (issue references, links to other discussion, etc) along with a well-formed and detailed commit message.

http://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html
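
The "tests first, then the fix" ordering can be checked mechanically. The following sketch is purely illustrative: double.sh and test.sh are made-up files in a scratch repository, standing in for real code and a real test suite.

```shell
set -e
work=$(mktemp -d) && cd "$work"          # scratch area, not a real project
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q repo && cd repo
git checkout -q -b fix_issue_323
echo 'echo $(( $1 * 3 ))' > double.sh    # buggy: should double its argument
git add double.sh && git commit -q -m "Initial commit (contains the bug)"

# First commit of the change set: the test that exposes the bug.
echo '[ "$(sh double.sh 2)" = "4" ]' > test.sh
git add test.sh && git commit -q -m "Add regression test for double.sh"

# Second commit of the change set: the fix itself.
echo 'echo $(( $1 * 2 ))' > double.sh
git commit -q -am "Fix double.sh to multiply by two"

# The test should fail on the test-only commit and pass at the tip.
git checkout -q --detach HEAD^
if sh test.sh; then before=pass; else before=fail; fi
git checkout -q fix_issue_323
if sh test.sh; then after=pass; else after=fail; fi
echo "before fix: $before, after fix: $after"
```

A reviewer can repeat the same two checkouts on your pull request to confirm that the test really covers the problem being fixed.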

3.9 Push your local branch to your github fork and initiate a pull request.

Initiate the push and create a remote tracking branch:

git push -u origin <local branch name>

Now go to the GitHub page for your fork, click on the "Pull Requests" tab, and then click on the green "New pull request" button. Make sure the two refs point at the repositories and branches you want, then click the "Create pull request" button.

This is the beginning of the code review process, and your contribution is now public, so you should not do any rebases during this period as they will complicate the process of reviewing your code.

If you already pushed your local changes to a private branch on your forked github repo, you can initiate a pull request from there, but you'll need to remember not to rebase while the review is in progress.

If your pull request branch was previously pushed but has been rebased locally, a normal push command will fail because the remote's update won't be a fast-forward. Use the following command instead:

git push --force-with-lease

With --force-with-lease, git refuses the non-fast-forward update if the remote branch has moved since you last fetched it, i.e. if someone has pushed changes that you don't have locally. If you've followed good "branch hygiene" it shouldn't be a problem, but if your git config has push.default = matching, a plain --force could overwrite every matching branch on your GitHub fork, not just the one you meant to update.

If you are unsure what will happen during a push command, use the -n option to do a dry-run first.
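
The protection offered by --force-with-lease can be demonstrated with a sketch like this one. It assumes a local bare repository (fork.git, a hypothetical name) in place of your GitHub fork and a second clone playing the part of someone who pushed to your branch without telling you.

```shell
set -e
work=$(mktemp -d) && cd "$work"          # scratch area, not a real project
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q seed && cd seed
git checkout -q -b open-avb-next
echo hello > README && git add README && git commit -q -m "Initial commit"
cd "$work"
git clone -q --bare seed fork.git        # stand-in for your fork
git clone -q fork.git mine && cd mine
git checkout -q -b fix_issue_323
echo fix >> README && git commit -q -am "Fix issue 323"
git push -q -u origin fix_issue_323

# A second clone pushes a commit to the same branch behind your back.
cd "$work"
git clone -q fork.git other && cd other
git checkout -q fix_issue_323
echo tweak >> README && git commit -q -am "Reviewer tweak"
git push -q origin fix_issue_323

# Back in your clone: rewrite history, then try to replace the remote branch.
cd "$work/mine"
git commit -q --amend -m "Fix issue 323 (reworded)"
if git push -q --force-with-lease origin fix_issue_323 2>/dev/null; then
    result=pushed
else
    result=rejected                      # the remote moved since your last fetch
fi
echo "push was $result"
```

In this situation a plain --force would have silently discarded the other person's commit; with the lease, you get the chance to fetch and look at it first.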

3.10 (optional) Make another local branch from the pull request branch

This will be a private branch again, so you can freely do interactive rebases to clean up your history into a clean patch set. Do a fast-forward merge back into the pull request tracking branch when you're ready for another round of review and push it to your fork.

From the head of the pull request branch, just run:

git checkout -b <new branch name>

When you want to add the new commits to the pull request, do any interactive rebasing you would like (of the new commits only, so that the merge can still be a fast-forward) and then run:

git checkout <original pull request branch>
git merge --ff-only <new branch name>

And if you're done with the new work branch:

git branch -d <new branch name>
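
Put together, the round trip through a second private branch might look like the following sketch. It runs in a scratch repository, and the branch name fix_issue_323-round2 is just an example.

```shell
set -e
work=$(mktemp -d) && cd "$work"          # scratch area, not a real project
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q repo && cd repo
git checkout -q -b open-avb-next
echo base > file.txt && git add file.txt && git commit -q -m "Initial commit"
git checkout -q -b fix_issue_323         # the pull request branch
echo fix >> file.txt && git commit -q -am "Fix issue 323"

# Private branch for the next round of work.
git checkout -q -b fix_issue_323-round2
echo "review feedback" >> file.txt
git commit -q -am "Address review feedback"

# Fold the new work back in; --ff-only refuses anything but a fast-forward.
git checkout -q fix_issue_323
git merge -q --ff-only fix_issue_323-round2
git branch -q -d fix_issue_323-round2

git log --oneline open-avb-next..fix_issue_323
```

If the merge is refused, commits that were already on the pull request branch have been rewritten on the private branch, and the pull request branch would need a forced push instead.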

3.11 (optional) Intermediate/final rebases for long-running pull requests

If you have to do a significant re-write of your change, or the branch your change applies to changes significantly during the review process, you may need to rebase your pull request branch onto the tip of the branch from which it started. This uses the same commands as rebasing before you issued the pull request.

When this operation is pushed to your forked repo on GitHub, it will update the pull request automatically. All comments referring to the previous history of the pull request branch will be noted as 'stale', though it will hold onto the old references if reviewers need to see what the old version of the code looked like.

You should be aware of anyone who has been tracking your pull request branch, as this semi-public rebasing can cause issues if others have made changes to local tracking branches of your pull request branch.

In turn, you should be aware that your own changes to tracking branches of the pull request branches of others could have their branch points removed. You'll need to cherry-pick your changes onto the new pull request after this happens.

3.12 After upstream merge, delete your change branches

There's no reason to keep them around; they are now part of upstream's development history and will show up in your repository via your upstream tracking branch. All this messing about with branches and rebasing serves to emulate what happens in a centralized VCS such as SVN, but it provides developers with a finer-grained control of their own patch-set development.

Delete a local branch (use -D instead if the branch to delete is not fully merged):

git branch -d <branch name>

Delete a remote-tracking branch (it will return during the next fetch if the remote branch it tracks is still there):

git branch -rd <remote>/<branch name>

Delete a remote branch (push an empty source ref, which is what normally goes before the colon, to the remote branch name):

git push <repository> :<branch name>
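
The three kinds of deletion can be tried together in a scratch setup such as the one below. It assumes a local bare repository (fork.git, a hypothetical name) as your fork, and a local fast-forward merge stands in for the change having been accepted upstream.

```shell
set -e
work=$(mktemp -d) && cd "$work"          # scratch area, not a real project
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q seed && cd seed
git checkout -q -b open-avb-next
echo hello > README && git add README && git commit -q -m "Initial commit"
cd "$work"
git clone -q --bare seed fork.git        # stand-in for your fork
git clone -q fork.git mine && cd mine
git checkout -q -b fix_issue_323
echo fix >> README && git commit -q -am "Fix issue 323"
git push -q -u origin fix_issue_323

# Stand-in for the upstream merge arriving on the permanent branch.
git checkout -q open-avb-next
git merge -q --ff-only fix_issue_323

git branch -q -d fix_issue_323           # local branch
git branch -q -rd origin/fix_issue_323   # remote-tracking branch
git push -q origin :fix_issue_323        # branch on the fork itself

git branch -a                            # only open-avb-next entries remain
```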

4 Task-based collaboration branch workflow

5 Misc git tips

5.1 Using git's "reuse recorded resolution" feature

When working with multiple repositories and performing a lot of merge and rebase operations, you may find yourself having to resolve essentially the same merge conflict over and over again. Fortunately, git provides a tool to record the resolutions you supply and automatically replay them when the conflict matches one with a recorded resolution.

To enable this, run the following command:

git config --global rerere.enabled true

See the following page, along with git help rerere, for more details:

http://git-scm.com/2010/03/03/rerere.htm
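
A quick way to watch rerere at work is to resolve a conflict, throw the merge away, and repeat it. The sketch below does this in a scratch repository; it enables rerere for that repository only rather than globally, and the file contents are made up.

```shell
set -e
work=$(mktemp -d) && cd "$work"          # scratch area, not a real project
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q repo && cd repo
git config rerere.enabled true           # repo-local here, not --global
git checkout -q -b open-avb-next
echo "color = red" > settings.txt
git add settings.txt && git commit -q -m "Initial commit"

# Two branches change the same line in different ways.
git checkout -q -b topic
echo "color = blue" > settings.txt && git commit -q -am "Topic: blue"
git checkout -q open-avb-next
echo "color = green" > settings.txt && git commit -q -am "Mainline: green"

# First merge: conflict, resolved by hand. rerere records the resolution.
git merge -q topic >/dev/null 2>&1 || true
echo "color = teal" > settings.txt
git commit -q -am "Merge topic"

# Discard the merge and repeat it; the same conflict comes back.
git reset -q --hard HEAD^
git merge -q topic >/dev/null 2>&1 || true

cat settings.txt                         # should show the recorded resolution
```

Note that the replayed resolution is only written to the working tree; you still review it, stage the file, and commit to conclude the merge.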

5.2 Splitting a commit

If you are cleaning up a working branch in preparation for a pull request, you may find that you mixed multiple logical changes into a single commit. Fortunately, it is not too difficult to break it apart and group changes in a more sensible order. In fact, it is sometimes easier to refactor a particularly messy branch by first squashing it into a single commit and then breaking that single commit into a logical set of changes.

The first step is to find the commit before the one you wish to split. If you already have a ref for the commit to split, you can use the ^ syntax to refer to its parent; otherwise, look up the preceding commit in the log. Start an interactive rebase with one of the following:

git rebase -i <ref>^

git rebase -i <prevref>

Now mark the commit to split with "edit" in your editor, save and quit. The rebase will stop just after applying that commit, leaving it as the current HEAD with a clean index.

The next thing to do is to undo that commit with the reset command, which moves HEAD back to the version you specify and un-stages the changes without touching the modified files in the working directory:

git reset HEAD^

Now all the changes for the commit you are editing are in the list of unstaged but modified files. Just select the first set of edits for your new first commit, then stage and commit them. Repeat until the desired change set has been built and the working directory is clean, then continue the rebase operation:

git rebase --continue
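
The whole split can be rehearsed non-interactively with a sketch like the one below. It works in a scratch repository with made-up files, and it replaces the manual "mark as edit" step with a sed one-liner passed through GIT_SEQUENCE_EDITOR; that substitution is only a convenience for the demonstration, not part of the procedure itself.

```shell
set -e
work=$(mktemp -d) && cd "$work"          # scratch area, not a real project
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
export GIT_EDITOR=true                   # never block waiting for an editor

git init -q repo && cd repo
git checkout -q -b fix_issue_323
echo a > a.txt && echo b > b.txt
git add a.txt b.txt && git commit -q -m "Initial commit"

# One commit that mixes two unrelated edits.
echo a2 >> a.txt && echo b2 >> b.txt
git commit -q -am "Mixed change"

# Start the rebase and mark the first todo entry as "edit".
GIT_SEQUENCE_EDITOR="sed -i.bak -e '1s/^pick/edit/'" git rebase -q -i HEAD^

# Undo the commit but keep its edits in the working directory.
git reset -q HEAD^

# Rebuild it as two separate commits, then finish the rebase.
git add a.txt && git commit -q -m "Change a"
git add b.txt && git commit -q -m "Change b"
git rebase --continue

git log --oneline
```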

5.3 Adding only some of a file's changes to the index

Sometimes you may find that a single file ends up with edits that belong to two or more distinct changes. Fortunately, you can stage those edits and commit them separately.

The first method is to use the --patch option to git add:

git add --patch <filename>

This will present the diff for the file to you one hunk at a time and give you a list of options for each one. You can choose to add the hunk, not add it, skip all remaining hunks, split the hunk into two (if there are unchanged lines between edits in the hunk), or edit the patch for the hunk directly.

An alternative is to use the --interactive option:

git add --interactive [files]

This gives you a menu more suited to browsing and teasing apart the changes to an entire repository rather than one file. See the man page on git add for more information.

Finally, the git gui interface provides a very convenient way to browse changed files and selectively stage files, hunks, or even ranges of line-level changes. Clicking on a file icon in the "Unstaged Changes" list will stage an entire file. Clicking on the file name will view the diff for the file in the large right-hand pane. There you can right-click on individual hunks or lines and select the option to stage either one; alternatively you can select a range of lines with the left mouse button and then right click for the option to stage the selected range.

5.4 Links to other tips

http://mislav.uniqpath.com/2010/07/git-tips/