summaryrefslogtreecommitdiff
path: root/Documentation/tutorial.txt
diff options
context:
space:
mode:
authorJunio C Hamano <junkio@cox.net>2005-07-15 11:40:56 -0700
committerLinus Torvalds <torvalds@g5.osdl.org>2005-07-15 12:08:15 -0700
commit3eb5128a108d59be350ce7f43d1579d588158430 (patch)
tree9c5b5a0d6f5985f8dcd7789ea07d0d908e2d9556 /Documentation/tutorial.txt
parente7c1ca4273e3306760f220ea1e6c4685aa7a8b92 (diff)
downloadgit-3eb5128a108d59be350ce7f43d1579d588158430.tar.gz
[PATCH] Documentation: pull, push, packing repository and working with others.
Describe where you can pull from with a bit more detail. Clarify description of pushing. Add a section on packing repositories. Add a section on recommended workflow for the project lead, subsystem maintainers and individual developers. Move "Tag" section around to make the flow of example simpler to follow. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Diffstat (limited to 'Documentation/tutorial.txt')
-rw-r--r--Documentation/tutorial.txt357
1 files changed, 283 insertions, 74 deletions
diff --git a/Documentation/tutorial.txt b/Documentation/tutorial.txt
index 50ac927186..b3f30ae7c5 100644
--- a/Documentation/tutorial.txt
+++ b/Documentation/tutorial.txt
@@ -453,6 +453,55 @@ With that, you should now be having some inkling of what git does, and
can explore on your own.
+[ Side note: most likely, you are not directly using the core
+ git Plumbing commands, but using Porcelain like Cogito on top
+ of it. Cogito works a bit differently and you usually do not
+ have to run "git-update-cache" yourself for changed files (you
+ do tell underlying git about additions and removals via
+ "cg-add" and "cg-rm" commands). Just before you make a commit
+ with "cg-commit", Cogito figures out which files you modified,
+ and runs "git-update-cache" on them for you. ]
+
+
+ Tagging a version
+ -----------------
+
+In git, there's two kinds of tags, a "light" one, and a "signed tag".
+
+A "light" tag is technically nothing more than a branch, except we put
+it in the ".git/refs/tags/" subdirectory instead of calling it a "head".
+So the simplest form of tag involves nothing more than
+
+ cat .git/HEAD > .git/refs/tags/my-first-tag
+
+after which point you can use this symbolic name for that particular
+state. You can, for example, do
+
+ git diff my-first-tag
+
+to diff your current state against that tag (which at this point will
+obviously be an empty diff, but if you continue to develop and commit
+stuff, you can use your tag as a "anchor-point" to see what has changed
+since you tagged it.
+
+A "signed tag" is actually a real git object, and contains not only a
+pointer to the state you want to tag, but also a small tag name and
+message, along with a PGP signature that says that yes, you really did
+that tag. You create these signed tags with
+
+ git tag <tagname>
+
+which will sign the current HEAD (but you can also give it another
+argument that specifies the thing to tag, ie you could have tagged the
+current "mybranch" point by using "git tag <tagname> mybranch").
+
+You normally only do signed tags for major releases or things
+like that, while the light-weight tags are useful for any marking you
+want to do - any time you decide that you want to remember a certain
+point, just create a private tag for it, and you have a nice symbolic
+name for the state at that point.
+
+
Copying archives
-----------------
@@ -729,117 +778,277 @@ simply do
and optionally give a branch-name for the remote end as a second
argument.
-[ Todo: fill in real examples ]
+The "remote" repository can even be on the same machine. One of
+the following notations can be used to name the repository to
+pull from:
+ Rsync URL
+ rsync://remote.machine/path/to/repo.git/
- Tagging a version
- -----------------
+ HTTP(s) URL
+ http://remote.machine/path/to/repo.git/
-In git, there's two kinds of tags, a "light" one, and a "signed tag".
+ GIT URL
+ git://remote.machine/path/to/repo.git/
+ remote.machine:/path/to/repo.git/
-A "light" tag is technically nothing more than a branch, except we put
-it in the ".git/refs/tags/" subdirectory instead of calling it a "head".
-So the simplest form of tag involves nothing more than
+ Local directory
+ /path/to/repo.git/
- cat .git/HEAD > .git/refs/tags/my-first-tag
+[ Side Note: currently, HTTP transport is slightly broken in
+ that when the remote repository is "packed" they do not always
+ work. But we have not talked about packing repository yet, so
+ let's not worry too much about it for now. ]
-after which point you can use this symbolic name for that particular
-state. You can, for example, do
-
- git diff my-first-tag
-
-to diff your current state against that tag (which at this point will
-obviously be an empty diff, but if you continue to develop and commit
-stuff, you can use your tag as a "anchor-point" to see what has changed
-since you tagged it.
-
-A "signed tag" is actually a real git object, and contains not only a
-pointer to the state you want to tag, but also a small tag name and
-message, along with a PGP signature that says that yes, you really did
-that tag. You create these signed tags with
-
- git tag <tagname>
-
-which will sign the current HEAD (but you can also give it another
-argument that specifies the thing to tag, ie you could have tagged the
-current "mybranch" point by using "git tag <tagname> mybranch").
-
-You normally only do signed tags for major releases or things
-like that, while the light-weight tags are useful for any marking you
-want to do - any time you decide that you want to remember a certain
-point, just create a private tag for it, and you have a nice symbolic
-name for the state at that point.
+[ Digression: you could do without using any branches at all, by
+ keeping as many local repositories as you would like to have
+ branches, and merging between them with "git pull", just like
+ you merge between branches. The advantage of this approach is
+ that it lets you keep set of files for each "branch" checked
+ out and you may find it easier to switch back and forth if you
+ juggle multiple lines of development simultaneously. Of
+ course, you will pay the price of more disk usage to hold
+ multiple working trees, but disk space is cheap these days. ]
Publishing your work
--------------------
-We already talked about using somebody else's work from a remote
-repository, in the "merging external work" section. It involved
-fetching the work from a remote repository; but how would _you_
-prepare a repository so that other people can fetch from it?
+So we can use somebody else's work from a remote repository; but
+how can _you_ prepare a repository to let other people pull from
+it?
-Your real work happens in your working directory with your
+Your do your real work in your working directory that has your
primary repository hanging under it as its ".git" subdirectory.
-You _could_ make it accessible remotely and ask people to pull
-from it, but in practice that is not the way things are usually
-done. A recommended way is to have a public repository, make it
-reachable by other people, and when the changes you made in your
-primary working directory are in good shape, update the public
-repository with it.
+You _could_ make that repository accessible remotely and ask
+people to pull from it, but in practice that is not the way
+things are usually done. A recommended way is to have a public
+repository, make it reachable by other people, and when the
+changes you made in your primary working directory are in good
+shape, update the public repository from it. This is often
+called "pushing".
[ Side note: this public repository could further be mirrored,
and that is how kernel.org git repositories are done. ]
-Publishing the changes from your private repository to your
-public repository requires you to have write privilege on the
-machine that hosts your public repository, and it is internally
-done via an SSH connection.
+Publishing the changes from your local (private) repository to
+your remote (public) repository requires a write privilege on
+the remote machine. You need to have an SSH account there to
+run a single command, "git-receive-pack".
-First, you need to create an empty repository to push to on the
-machine that houses your public repository. This needs to be
+First, you need to create an empty repository on the remote
+machine that will house your public repository. This empty
+repository will be populated and be kept up-to-date by pushing
+into it later. Obviously, this repository creation needs to be
done only once.
+[ Digression: "git push" uses a pair of programs,
+ "git-send-pack" on your local machine, and "git-receive-pack"
+ on the remote machine. The communication between the two over
+ the network internally uses an SSH connection. ]
+
Your private repository's GIT directory is usually .git, but
-often your public repository is named "<projectname>.git".
-Let's create such a public repository for project "my-git".
-After logging into the remote machine, create an empty
-directory:
+your public repository is often named after the project name,
+i.e. "<project>.git". Let's create such a public repository for
+project "my-git". After logging into the remote machine, create
+an empty directory:
mkdir my-git.git
-Then, initialize that directory with git-init-db, but this time,
-since it's name is not usual ".git", we do things a bit
-differently:
+Then, make that directory into a GIT repository by running
+git-init-db, but this time, since it's name is not the usual
+".git", we do things slightly differently:
GIT_DIR=my-git.git git-init-db
Make sure this directory is available for others you want your
-changes to be pulled by. Also make sure that you have the
-'git-receive-pack' program on the $PATH.
-
-[ Side note: many installations of sshd does not invoke your
- shell as the login shell when you directly run programs; what
- this means is that if your login shell is bash, only .bashrc
- is read bypassing .bash_profile. As a workaround, make sure
- .bashrc sets up $PATH so that 'git-receive-pack' program can
- be run. ]
-
-Your 'public repository' is ready to accept your changes. Now,
-come back to the machine you have your private repository. From
+changes to be pulled by via the transport of your choice. Also
+you need to make sure that you have the "git-receive-pack"
+program on the $PATH.
+
+[ Side note: many installations of sshd do not invoke your shell
+ as the login shell when you directly run programs; what this
+ means is that if your login shell is bash, only .bashrc is
+ read and not .bash_profile. As a workaround, make sure
+ .bashrc sets up $PATH so that you can run 'git-receive-pack'
+ program. ]
+
+Your "public repository" is now ready to accept your changes.
+Come back to the machine you have your private repository. From
there, run this command:
git push <public-host>:/path/to/my-git.git master
This synchronizes your public repository to match the named
-branch head (i.e. refs/heads/master in this case) and objects
-reachable from them in your current repository.
+branch head (i.e. "master" in this case) and objects reachable
+from them in your current repository.
As a real example, this is how I update my public git
repository. Kernel.org mirror network takes care of the
-propagation to other publically visible machines:
+propagation to other publicly visible machines:
git push master.kernel.org:/pub/scm/git/git.git/
-[ to be continued.. cvsimports, pushing and pulling ]
+[ Digression: your GIT "public" repository people can pull from
+ is different from a public CVS repository that lets read-write
+ access to multiple developers. It is a copy of _your_ primary
+ repository published for others to use, and you should not
+ push into it from more than one repository (this means, not
+ just disallowing other developers to push into it, but also
+ you should push into it from a single repository of yours).
+ Sharing the result of work done by multiple people are always
+ done by pulling (i.e. fetching and merging) from public
+ repositories of those people. Typically this is done by the
+ "project lead" person, and the resulting repository is
+ published as the public repository of the "project lead" for
+ everybody to base further changes on. ]
+
+
+ Packing your repository
+ -----------------------
+
+Earlier, we saw that one file under .git/objects/??/ directory
+is stored for each git object you create. This representation
+is convenient and efficient to create atomically and safely, but
+not so to transport over the network. Since git objects are
+immutable once they are created, there is a way to optimize the
+storage by "packing them together". The command
+
+ git repack
+
+will do it for you. If you followed the tutorial examples, you
+would have accumulated about 17 objects in .git/objects/??/
+directories by now. "git repack" tells you how many objects it
+packed, and stores the packed file in .git/objects/pack
+directory.
+
+[ Side Note: you will see two files, pack-*.pack and pack-*.idx,
+ in .git/objects/pack directory. They are closely related to
+ each other, and if you ever copy them by hand to a different
+ repository for whatever reason, you should make sure you copy
+ them together. The former holds all the data from the objects
+ in the pack, and the latter holds the index for random
+ access. ]
+
+If you are paranoid, running "git-verify-pack" command would
+detect if you have a corrupt pack, but do not worry too much.
+Our programs are always perfect ;-).
+
+Once you have packed objects, you do not need to leave the
+unpacked objects that are contained in the pack file anymore.
+
+ git prune-packed
+
+would remove them for you.
+
+You can try running "find .git/objects -type f" before and after
+you run "git prune-packed" if you are curious.
+
+[ Side Note: as we already mentioned, "git pull" is broken for
+ some transports dealing with packed repositories right now, so
+ do not run "git prune-packed" if you plan to give "git pull"
+ access via HTTP transport for now. ]
+
+If you run "git repack" again at this point, it will say
+"Nothing to pack". Once you continue your development and
+accumulate the changes, running "git repack" again will create a
+new pack, that contains objects created since you packed your
+archive the last time. We recommend that you pack your project
+soon after the initial import (unless you are starting your
+project from scratch), and then run "git repack" every once in a
+while, depending on how active your project is.
+
+When a repository is synchronized via "git push" and "git pull",
+objects packed in the source repository is usually stored
+unpacked in the destination, unless rsync transport is used.
+
+
+ Working with Others
+ -------------------
+
+A recommended work cycle for a "project lead" is like this:
+
+ (1) Prepare your primary repository on your local machine. Your
+ work is done there.
+
+ (2) Prepare a public repository accessible to others.
+
+ (3) Push into the public repository from your primary
+ repository.
+
+ (4) "git repack" the public repository. This establishes a big
+ pack that contains the initial set of objects.
+
+ (5) Keep working in your primary repository, and push your
+ changes to the public repository. Your changes include
+ your own, patches you receive via e-mail, and merge resulting
+ from pulling the "public" repositories of your "subsystem
+ maintainers".
+
+ You can repack this private repository whenever you feel
+ like.
+
+ (6) Every once in a while, "git repack" the public repository.
+ Go back to step (5) and continue working.
+
+A recommended work cycle for a "subsystem maintainer" that
+works on that project and has own "public repository" is like
+this:
+
+ (1) Prepare your work repository, by "git clone" the public
+ repository of the "project lead".
+
+ (2) Prepare a public repository accessible to others.
+
+ (3) Copy over the packed files from "project lead" public
+ repository to your public repository by hand; this part is
+ currently not automated.
+
+ (4) Push into the public repository from your primary
+ repository.
+
+ (5) Keep working in your primary repository, and push your
+ changes to your public repository, and ask your "project
+ lead" to pull from it. Your changes include your own,
+ patches you receive via e-mail, and merge resulting from
+ pulling the "public" repositories of your "project lead"
+ and possibly your "sub-subsystem maintainers".
+
+ You can repack this private repository whenever you feel
+ like.
+
+ (6) Every once in a while, "git repack" the public repository.
+ Go back to step (5) and continue working.
+
+A recommended work cycle for an "individual developer" who does
+not have a "public" repository is somewhat different. It goes
+like this:
+
+ (1) Prepare your work repositories, by "git clone" the public
+ repository of the "project lead" (or "subsystem
+ maintainer", if you work on a subsystem).
+
+ (2) Copy .git/refs/master to .git/refs/upstream.
+
+ (3) Do your work there. Make commits.
+
+ (4) Run "git fetch" from the public repository of your upstream
+ every once in a while. This does only the first half of
+ "git pull" but does not merge. The head of the public
+ repository is stored in .git/FETCH_HEAD. Copy it in
+ .git/refs/heads/upstream.
+
+ (5) Use "git cherry" to see which ones of your patches were
+ accepted, and/or use "git rebase" to port your unmerged
+ changes forward to the updated upstream.
+
+ (6) Use "git format-patch upstream" to prepare patches for
+ e-mail submission to your upstream and send it out.
+ Go back to step (3) and continue.
+
+[Side Note: I think Cogito calls this upstream "origin".
+ Somebody care to confirm or deny? ]
+
+
+[ to be continued.. cvsimports ]