summaryrefslogtreecommitdiff
path: root/doc/tutorial.rst
diff options
context:
space:
mode:
Diffstat (limited to 'doc/tutorial.rst')
-rw-r--r--doc/tutorial.rst415
1 files changed, 295 insertions, 120 deletions
diff --git a/doc/tutorial.rst b/doc/tutorial.rst
index 838fd68e..695e9812 100644
--- a/doc/tutorial.rst
+++ b/doc/tutorial.rst
@@ -4,9 +4,8 @@
GitPython Tutorial
==================
-GitPython provides object model access to your git repository. Once you have
-created a repository object, you can traverse it to find parent commit(s),
-trees, blobs, etc.
+GitPython provides object model access to your git repository. This tutorial is
+composed of multiple sections, each of which explain a real-life usecase.
Initialize a Repo object
************************
@@ -21,84 +20,185 @@ is my working repository and contains the ``.git`` directory. You can also
initialize GitPython with a bare repository.
>>> repo = Repo.create("/var/git/git-python.git")
+
+A repo object provides high-level access to your data, it allows you to create
+and delete heads, tags and remotes and access the configuration of the
+repository.
+
+ >>> repo.config_reader() # get a config reader for read-only access
+ >>> repo.config_writer() # get a config writer to change configuration
+
+Query the active branch, query untracked files or whether the repository data
+has been modified.
+
+ >>> repo.is_dirty()
+ False
+ >>> repo.untracked_files()
+ ['my_untracked_file']
+
+Clone from existing repositories or initialize new empty ones.
+
+ >>> cloned_repo = repo.clone("to/this/path")
+ >>> new_repo = repo.init("path/for/new/repo")
+
+Archive the repository contents to a tar file.
+
+ >>> repo.archive(open("repo.tar",'w'))
+
+Examining References
+********************
+
+References are the tips of your commit graph from which you can easily examine
+the history of your project.
+
+ >>> heads = repo.heads
+ >>> master = heads.master # lists can be accessed by name for convenience
+ >>> master.commit # the commit pointed to by head called master
+ >>> master.rename("new_name") # rename individual heads or
+
+Tags are (usually immutable) references to a commit and/or a tag object.
+
+ >>> tags = repo.tags
+ >>> tagref = tags[0]
+ >>> tagref.tag # tags may have tag objects carrying additional information
+ >>> tagref.commit # but they always point to commits
+ >>> repo.delete_tag(tagref) # delete or
+ >>> repo.create_tag("my_tag") # create tags using the repo
+
+A symbolic reference is a special case of a reference as it points to another
+reference instead of a commit
+
+Modifying References
+********************
+You can easily create and delete reference types or modify where they point to.
+
+ >>> repo.delete_head('master')
+ >>> master = repo.create_head('master')
+ >>> master.commit = 'HEAD~10' # set another commit without changing index or working tree
+
+Create or delete tags the same way except you may not change them afterwards
+
+ >>> new_tag = repo.create_tag('my_tag', 'my message')
+ >>> repo.delete_tag(new_tag)
+
+Change the symbolic reference to switch branches cheaply ( without adjusting the index
+or the working copy )
+
+ >>> new_branch = repo.create_head('new_branch')
+ >>> repo.head.reference = new_branch
+
+Understanding Objects
+*********************
+An Object is anything storable in gits object database. Objects contain information
+about their type, their uncompressed size as well as their data. Each object is
+uniquely identified by a SHA1 hash, being 40 hexadecimal characters in size.
+
+Git only knows 4 distinct object types being Blobs, Trees, Commits and Tags.
+
+In Git-Pyhton, all objects can be accessed through their common base, compared
+and hashed, as shown in the following example.
+
+ >>> hc = repo.head.commit
+ >>> hct = hc.tree
+ >>> hc != hct
+ >>> hc != repo.tags[0]
+ >>> hc == repo.head.reference.commit
+
+Basic fields are
+
+ >>> hct.type
+ 'tree'
+ >>> hct.size
+ 166
+ >>> hct.sha
+ 'a95eeb2a7082212c197cabbf2539185ec74ed0e8'
+ >>> hct.data # returns string with pure uncompressed data
+ '...'
+ >>> len(hct.data) == hct.size
+
+Index Objects are objects that can be put into gits index. These objects are trees
+and blobs which additionally know about their path in the filesystem as well as their
+mode.
+
+ >>> hct.path # root tree has no path
+ ''
+ >>> hct.trees[0].path # the first subdirectory has one though
+ 'dir'
+ >>> htc.mode # trees have mode 0
+ 0
+ >>> '%o' % htc.blobs[0].mode # blobs have a specific mode though comparable to a standard linux fs
+ 100644
+
+Access blob data (or any object data) directly or using streams.
+ >>> htc.data # binary tree data
+ >>> htc.blobs[0].data_stream # stream object to read data from
+ >>> htc.blobs[0].stream_data(my_stream) # write data to given stream
+
+
+The Commit object
+*****************
-Getting a list of commits
-*************************
-
-From the ``Repo`` object, you can get a list of ``Commit``
-objects.
-
- >>> repo.commits()
- [<git.Commit "207c0c4418115df0d30820ab1a9acd2ea4bf4431">,
- <git.Commit "a91c45eee0b41bf3cdaad3418ca3850664c4a4b4">,
- <git.Commit "e17c7e11aed9e94d2159e549a99b966912ce1091">,
- <git.Commit "bd795df2d0e07d10e0298670005c0e9d9a5ed867">]
+Commit objects contain information about a specific commit. Obtain commits using
+references as done in 'Examining References' or as follows
-Called without arguments, ``Repo.commits`` returns a list of up to ten commits
-reachable by the master branch (starting at the latest commit). You can ask
-for commits beginning at a different branch, commit, tag, etc.
+Obtain commits at the specified revision:
- >>> repo.commits('mybranch')
- >>> repo.commits('40d3057d09a7a4d61059bca9dca5ae698de58cbe')
- >>> repo.commits('v0.1')
+ >>> repo.commit('master')
+ >>> repo.commit('v0.1')
+ >>> repo.commit('HEAD~10')
-You can specify the maximum number of commits to return.
+Iterate 100 commits
- >>> repo.commits('master', max_count=100)
+ >>> repo.iter_commits('master', max_count=100)
If you need paging, you can specify a number of commits to skip.
- >>> repo.commits('master', max_count=10, skip=20)
+ >>> repo.iter_commits('master', max_count=10, skip=20)
The above will return commits 21-30 from the commit list.
-The Commit object
-*****************
-
-Commit objects contain information about a specific commit.
+ >>> headcommit = repo.headcommit.commit
- >>> head = repo.commits()[0]
-
- >>> head.id
+ >>> headcommit.sha
'207c0c4418115df0d30820ab1a9acd2ea4bf4431'
- >>> head.parents
+ >>> headcommit.parents
[<git.Commit "a91c45eee0b41bf3cdaad3418ca3850664c4a4b4">]
- >>> head.tree
+ >>> headcommit.tree
<git.Tree "563413aedbeda425d8d9dcbb744247d0c3e8a0ac">
- >>> head.author
+ >>> headcommit.author
<git.Actor "Michael Trier <mtrier@gmail.com>">
- >>> head.authored_date
- (2008, 5, 7, 5, 0, 56, 2, 128, 0)
+ >>> headcommit.authored_date # seconds since epoch
+ 1256291446
- >>> head.committer
+ >>> headcommit.committer
<git.Actor "Michael Trier <mtrier@gmail.com>">
- >>> head.committed_date
- (2008, 5, 7, 5, 0, 56, 2, 128, 0)
+ >>> headcommit.committed_date
+ 1256291446
- >>> head.message
+ >>> headcommit.message
'cleaned up a lot of test information. Fixed escaping so it works with
subprocess.'
-Note: date time is represented in a `struct_time`_ format. Conversion to
+Note: date time is represented in a ``seconds since epock`` format. Conversion to
human readable form can be accomplished with the various time module methods.
>>> import time
- >>> time.asctime(head.committed_date)
+ >>> time.asctime(time.gmtime(headcommit.committed_date))
'Wed May 7 05:56:02 2008'
- >>> time.strftime("%a, %d %b %Y %H:%M", head.committed_date)
+ >>> time.strftime("%a, %d %b %Y %H:%M", time.gmtime(headcommit.committed_date))
'Wed, 7 May 2008 05:56'
.. _struct_time: http://docs.python.org/library/time.html
You can traverse a commit's ancestry by chaining calls to ``parents``.
- >>> repo.commits()[0].parents[0].parents[0].parents[0]
+ >>> headcommit.parents[0].parents[0].parents[0]
The above corresponds to ``master^^^`` or ``master~3`` in git parlance.
@@ -108,49 +208,37 @@ The Tree object
A tree records pointers to the contents of a directory. Let's say you want
the root tree of the latest commit on the master branch.
- >>> tree = repo.commits()[0].tree
+ >>> tree = repo.heads.master.commit.tree
<git.Tree "a006b5b1a8115185a228b7514cdcd46fed90dc92">
- >>> tree.id
+ >>> tree.sha
'a006b5b1a8115185a228b7514cdcd46fed90dc92'
Once you have a tree, you can get the contents.
- >>> contents = tree.values()
- [<git.Blob "6a91a439ea968bf2f5ce8bb1cd8ddf5bf2cad6c7">,
- <git.Blob "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391">,
- <git.Tree "eaa0090ec96b054e425603480519e7cf587adfc3">,
- <git.Blob "980e72ae16b5378009ba5dfd6772b59fe7ccd2df">]
-
-The tree is implements a dictionary protocol so it can be used and acts just
-like a dictionary with some additional properties.
-
- >>> tree.items()
- [('lib', <git.Tree "310ebc9a0904531438bdde831fd6a27c6b6be58e">),
- ('LICENSE', <git.Blob "6797c1421052efe2ded9efdbb498b37aeae16415">),
- ('doc', <git.Tree "a58386dd101f6eb7f33499317e5508726dfd5e4f">),
- ('MANIFEST.in', <git.Blob "7da4e346bb0a682e99312c48a1f452796d3fb988">),
- ('.gitignore', <git.Blob "6870991011cc8d9853a7a8a6f02061512c6a8190">),
- ('test', <git.Tree "c6f6ee37d328987bc6fb47a33fed16c7886df857">),
- ('VERSION', <git.Blob "9faa1b7a7339db85692f91ad4b922554624a3ef7">),
- ('AUTHORS', <git.Blob "9f649ef5448f9666d78356a2f66ba07c5fb27229">),
- ('README', <git.Blob "9643dcf549f34fbd09503d4c941a5d04157570fe">),
- ('ez_setup.py', <git.Blob "3031ad0d119bd5010648cf8c038e2bbe21969ecb">),
- ('setup.py', <git.Blob "271074302aee04eb0394a4706c74f0c2eb504746">),
- ('CHANGES', <git.Blob "0d236f3d9f20d5e5db86daefe1e3ba1ce68e3a97">)]
-
-This tree contains three ``Blob`` objects and one ``Tree`` object. The trees
-are subdirectories and the blobs are files. Trees below the root have
-additional attributes.
-
- >>> contents = tree["lib"]
- <git.Tree "c1c7214dde86f76bc3e18806ac1f47c38b2b7a3">
-
- >>> contents.name
- 'test'
-
- >>> contents.mode
- '040000'
+ >>> tree.trees # trees are subdirectories
+ [<git.Tree "f7eb5df2e465ab621b1db3f5714850d6732cfed2">]
+
+ >>> tree.blobs # blobs are files
+ [<git.Blob "a871e79d59cf8488cac4af0c8f990b7a989e2b53">,
+ <git.Blob "3594e94c04db171e2767224db355f514b13715c5">,
+ <git.Blob "e79b05161e4836e5fbf197aeb52515753e8d6ab6">,
+ <git.Blob "94954abda49de8615a048f8d2e64b5de848e27a1">]
+
+Its useful to know that a tree behaves like a list with the ability to
+query entries by name.
+
+ >>> tree[0] == tree['dir']
+ <git.Tree "f7eb5df2e465ab621b1db3f5714850d6732cfed2">
+ >>> for entry in tree: do_something(entry)
+
+ >>> blob = tree[0][0]
+ >>> blob.name
+ 'file'
+ >>> blob.path
+ 'dir/file'
+ >>> blob.abspath
+ '/Users/mtrier/Development/git-python/dir/file'
There is a convenience method that allows you to get a named sub-object
from a tree with a syntax similar to how paths are written in an unix
@@ -166,46 +254,133 @@ You can also get a tree directly from the repository if you know its name.
>>> repo.tree("c1c7214dde86f76bc3e18806ac1f47c38b2b7a30")
<git.Tree "c1c7214dde86f76bc3e18806ac1f47c38b2b7a30">
+ >>> repo.tree('0.1.6')
+ <git.Tree "6825a94104164d9f0f5632607bebd2a32a3579e5">
+
+As trees only allow direct access to their direct entries, use the traverse
+method to obtain an iterator to access entries recursively.
+
+ >>> tree.traverse()
+ <generator object at 0x7f6598bd65a8>
+ >>> for entry in traverse(): do_something(entry)
+
+
+The Index Object
+****************
+The git index is the stage containing changes to be written to the next commit
+or where merges finally have to take place. You may freely access and manipulate
+this information using the Index Object.
+
+ >>> index = repo.index
+
+Access objects and add/remove entries. Commit the changes.
+
+ >>> for stage,blob in index.iter_blobs(): do_something(...)
+ Access blob objects
+ >>> for (path,stage),entry in index.entries.iteritems: pass
+ Access the entries directly
+ >>> index.add(['my_new_file']) # add a new file to the index
+ >>> index.remove(['dir/existing_file'])
+ >>> new_commit = index.commit("my commit message")
+
+Create new indices from other trees or as result of a merge. Write that result to
+a new index.
+
+ >>> tmp_index = Index.from_tree(repo, 'HEAD~1') # load a tree into a temporary index
+ >>> merge_index = Index.from_tree(repo, 'HEAD', 'some_branch') # merge two trees
+ >>> merge_index.write("merged_index")
+
+Handling Remotes
+****************
+
+Remotes are used as alias for a foreign repository to ease pushing to and fetching
+from them.
+
+ >>> test_remote = repo.create_remote('test', 'git@server:repo.git')
+ >>> repo.delete_remote(test_remote) # create and delete remotes
+ >>> origin = repo.remotes.origin # get default remote by name
+ >>> origin.refs # local remote references
+ >>> o = origin.rename('new_origin') # rename remotes
+ >>> o.fetch() # fetch, pull and push from and to the remote
+ >>> o.pull()
+ >>> o.push()
+
+You can easily access configuration information for a remote by accessing options
+as if they where attributes.
+
+ >>> o.url
+ 'git@server:dummy_repo.git'
+
+Change configuration for a specific remote only
+ >>> o.config_writer.set("url", "other_url")
+
+Obtaining Diff Information
+**************************
+
+Diffs can generally be obtained by Subclasses of ``Diffable`` as they provide
+the ``diff`` method. This operation yields a DiffIndex allowing you to easily access
+diff information about paths.
+
+Diffs can be made between Index and Trees, Index and the working tree, trees and
+trees as well as trees and the working copy. If commits are involved, their tree
+will be used implicitly.
+
+ >>> hcommit = repo.head.commit
+ >>> idiff = hcommit.diff() # diff tree against index
+ >>> tdiff = hcommit.diff('HEAD~1') # diff tree against previous tree
+ >>> wdiff = hcommit.diff(None) # diff tree against working tree
+
+ >>> index = repo.index
+ >>> index.diff() # diff index against itself yielding empty diff
+ >>> index.diff(None) # diff index against working copy
+ >>> index.diff('HEAD') # diff index against current HEAD tree
+
+The item returned is a DiffIndex which is essentially a list of Diff objects. It
+provides additional filtering to find what you might be looking for
+
+ >>> for diff_added in wdiff.iter_change_type('A'): do_something(diff_added)
+
+Switching Branches
+******************
+To switch between branches, you effectively need to point your HEAD to the new branch
+head and reset your index and working copy to match. A simple manual way to do it
+is the following one.
+
+ >>> repo.head.reference = repo.heads.other_branch
+ >>> repo.head.reset(index=True, working_tree=True
+
+The previous approach would brutally overwrite the user's changes in the working copy
+and index though and is less sophisticated than a git-checkout for instance which
+generally prevents you from destroying your work. Use the safer approach as follows:
+
+ >>> repo.heads.master.checkout() # checkout the branch using git-checkout
+ >>> repo.heads.other_branch.checkout()
+
+Using git directly
+******************
+In case you are missing functionality as it has not been wrapped, you may conveniently
+use the git command directly. It is owned by each repository instance.
+
+ >>> git = repo.git
+ >>> git.checkout('head', b="my_new_branch") # default command
+ >>> git.for_each_ref() # '-' becomes '_' when calling it
+
+The return value will by default be a string of the standard output channel produced
+by the command.
+
+Keyword arguments translate to short and long keyword arguments on the commandline.
+The special notion ``git.command(flag=True)`` will create a flag without value like
+``command --flag``.
+
+If ``None`` is found in the arguments, it will be dropped silently. Lists and tuples
+passed as arguments will be unpacked to individual arguments. Objects are converted
+to strings using the str(...) function.
+
+And even more ...
+*****************
-The Blob object
-***************
-
-A blob represents a file. Trees often contain blobs.
-
- >>> blob = tree['urls.py']
- <git.Blob "b19574431a073333ea09346eafd64e7b1908ef49">
-
-A blob has certain attributes.
-
- >>> blob.name
- 'urls.py'
-
- >>> blob.mode
- '100644'
-
- >>> blob.mime_type
- 'text/x-python'
-
- >>> blob.size
- 415
-
-You can get the data of a blob as a string.
-
- >>> blob.data
- "from django.conf.urls.defaults import *\nfrom django.conf..."
-
-You can also get a blob directly from the repo if you know its name.
-
- >>> repo.blob("b19574431a073333ea09346eafd64e7b1908ef49")
- <git.Blob "b19574431a073333ea09346eafd64e7b1908ef49">
-
-What Else?
-**********
+There is more functionality in there, like the ability to archive repositories, get stats
+and logs, blame, and probably a few other things that were not mentioned here.
-There is more stuff in there, like the ability to tar or gzip repos, stats,
-log, blame, and probably a few other things. Additionally calls to the git
-instance are handled through a ``__getattr__`` construct, which makes
-available any git commands directly, with a nice conversion of Python dicts
-to command line parameters.
+Check the unit tests for an in-depth introduction on how each function is supposed to be used.
-Check the unit tests, they're pretty exhaustive.