| Commit message (Collapse) | Author | Age | Files | Lines |
|\ |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
With this,
git pickaxe -L '/--progress/,+20' v1.4.0 -- pack-objects.c
gives you 20 lines starting from the first occurrence of
'--progress' in pack-objects, digging from v1.4.0 version.
You can also say
git pickaxe -L '/--progress/,-5' v1.4.0 -- pack-objects.c
Signed-off-by: Junio C Hamano <junkio@cox.net>
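A minimal sketch of how a '/regex/,+N' or '/regex/,-N' range could be resolved, assuming "+N" means N lines starting at the match and "-N" means N lines ending at the match. The function name and the clamping behaviour are illustrative assumptions, not the code in pickaxe:

```python
import re


def resolve_offset_range(lines, pattern, offset):
    """Return a 1-based inclusive (first, last) line range.

    `pattern` locates the starting line; `offset` is a string such as
    "+20" (that many lines starting at the match) or "-5" (that many
    lines ending at the match). Hypothetical helper for illustration.
    """
    regex = re.compile(pattern)
    start = next(
        (n for n, text in enumerate(lines, start=1) if regex.search(text)),
        None,
    )
    if start is None:
        raise ValueError(f"no line matches {pattern!r}")
    if offset[:1] not in ("+", "-"):
        raise ValueError("offset must begin with + or -")
    count = int(offset[1:])
    if count < 1:
        raise ValueError("offset must be at least 1")
    if offset[0] == "+":
        first, last = start, start + count - 1
    else:
        first, last = start - count + 1, start
    # Clamp to the file so a range near either end stays valid (assumption).
    return max(first, 1), min(last, len(lines))
```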
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
With this change, you can specify the beginning and the ending
line of the range you wish to inspect with pattern matching.
For example, these are equivalent with the git.git sources:
git pickaxe -L 7,21 v1.4.0 -- commit.c
git pickaxe -L '/^struct sort_node/,/^}/' v1.4.0 -- commit.c
git pickaxe -L '7,/^}/' v1.4.0 -- commit.c
git pickaxe -L '/^struct sort_node/,21' v1.4.0 -- commit.c
Signed-off-by: Junio C Hamano <junkio@cox.net>
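The equivalence of the four forms above can be sketched as follows. This is an illustration only: it assumes patterns contain no comma, and that the ending pattern is searched from the line after the start. Both are assumptions of the sketch, and the helper names are hypothetical:

```python
import re


def find_from(lines, pattern, begin):
    """Return the 1-based number of the first line at or after `begin`
    that matches `pattern`."""
    regex = re.compile(pattern)
    for number in range(begin, len(lines) + 1):
        if regex.search(lines[number - 1]):
            return number
    raise ValueError(f"no line at or after {begin} matches {pattern!r}")


def resolve_endpoint(lines, text, begin):
    """Resolve one side of the range: a /regex/ or a plain line number."""
    if len(text) >= 2 and text.startswith("/") and text.endswith("/"):
        return find_from(lines, text[1:-1], begin)
    return int(text)


def resolve_range(lines, spec):
    """Resolve 'START,END' into a 1-based inclusive (start, end) pair."""
    start_text, end_text = spec.split(",", 1)
    start = resolve_endpoint(lines, start_text, 1)
    # Assumption: the end pattern is looked for after the start line.
    end = resolve_endpoint(lines, end_text, start + 1)
    return start, end
```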
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
It turns out that pickaxe reads the same blob repeatedly while
blame can reuse the blob already read for the parent when
handling a child commit when it is the parent's turn to pass its
blame to the grandparent. Have a cache in the origin structure
to keep the blob there, which will be garbage collected when the
origin loses the last reference to it.
Signed-off-by: Junio C Hamano <junkio@cox.net>
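A rough sketch of the idea, with a hypothetical Origin class and a caller-supplied read_blob function standing in for the real object access; none of these names come from the pickaxe source:

```python
class Origin:
    """A (commit, path) pair that caches its blob after the first read."""

    def __init__(self, commit, path, read_blob):
        self.commit = commit
        self.path = path
        self._read_blob = read_blob  # hypothetical loader: (commit, path) -> bytes
        self._blob = None
        self.refcount = 1

    def blob(self):
        # Read the blob only once; later callers reuse the cached copy.
        if self._blob is None:
            self._blob = self._read_blob(self.commit, self.path)
        return self._blob

    def release(self):
        # Drop the cached blob when the last reference goes away.
        self.refcount -= 1
        if self.refcount == 0:
            self._blob = None
```

With this shape, the child and the parent pass can both call blob() on the same origin and the underlying read happens once.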
|
| | |
|
| |
| |
| |
| |
| |
| |
| |
| | |
When we introduced the cached origin per commit, we gave up proper
garbage collecting because it meant that commits hold onto their
cached copy. There is no need to do so.
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The reason to do this is the same as in the previous change for
line copy detection within the same file (-M).
Also this fixes -C and -C -C (aka find-copies-harder) logic; in
this application we are not interested in the similarity
matching diffcore-rename makes, because we are only interested
in scanning files that were modified, or in the case of -C -C,
scanning all files in the parent and we want to do that
ourselves.
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
| |
| |
| |
| |
| |
| |
| |
| | |
Otherwise we would miss copied lines that are contained in the
parts before or after the part that we find after splitting the
blame_entry (i.e. split[0] and split[2]).
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
| |
| |
| |
| |
| |
| |
| |
| | |
If more than one parent in an Octopus merge has the same
origin, ignore the later ones because they would not make any
difference in the outcome.
Signed-off-by: Junio C Hamano <junkio@cox.net>
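As an illustration, skipping later parents that share an origin could look like the sketch below. How "the same origin" is decided is left to a caller-supplied key function here; that, and the function name, are assumptions of the sketch:

```python
def drop_duplicate_parents(parents, origin_key):
    """Keep only the first parent for each distinct origin, in order.

    `origin_key` maps a parent to whatever identifies its origin
    (hypothetical; for example a blob id).
    """
    seen = set()
    kept = []
    for parent in parents:
        key = origin_key(parent)
        if key in seen:
            continue  # a later parent with the same origin changes nothing
        seen.add(key)
        kept.append(parent)
    return kept
```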
|
| |
| |
| |
| |
| |
| |
| | |
The idea is that we are interested in renaming into only one path, so
we do not care about renames that happen elsewhere.
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
We forgot to add prefix to the given path.
[jc: interestingly enough, Jeff King had the same idea after I
pushed mine out to "pu", and his patch was cleaner, so I dropped
mine.]
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This is a shorthand for "<rev> --not <rev>^@", i.e. "include
this commit but exclude any of its parents".
When a new file $F is introduced by revision $R, this notation
can be used to find a copy-and-paste from existing file in the
parents of that revision without annotating the ancestry of the
lines that were copied from:
git pickaxe -f -C $R^! -- $F
Signed-off-by: Junio C Hamano <junkio@cox.net>
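The expansion described above can be sketched like this, assuming a hypothetical parents_of lookup that returns the parent revisions of a commit:

```python
def expand_bang(rev, parents_of):
    """Expand 'rev^!' into the equivalent argument list:
    the commit itself, then --not, then every parent of the commit."""
    return [rev, "--not", *parents_of(rev)]
```

For a root commit the list simply ends after "--not", since there are no parents to exclude.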
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Depending on how bushy the commit DAG is, this saves calls to
the internal diff-tree for fork-point commits. For example,
annotating Makefile in the kernel repository saves about a third
of such diff-tree calls.
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
When a merge adds a new file from the second parent, the
earlier code tried to find renames in the first parent before
noticing that the version from the second parent was added
without modification.
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
| |
| |
| |
| |
| |
| |
| | |
When compiled for debugging, make sure that refcnt sanity check
code detects underflows in origin reference counting.
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
| |
| |
| |
| |
| |
| |
| |
| | |
This makes "git-pickaxe -C master -- revision.c" finish with
proper refcounts for all origins. I am reasonably happy with
it.
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
| |
| |
| |
| |
| |
| |
| |
| | |
The command rejects -L1,10 as an invalid line range specifier
and I got frustrated enough by it, so this makes it allow both
forms of input.
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The origin structure is allocated for each commit and path, and
as the code traverses down it is copied into different blame entries.
To avoid leaks, try refcounting them.
This still seems to leak, which I haven't tracked down fully yet.
Signed-off-by: Junio C Hamano <junkio@cox.net>
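A small sketch of reference counting for a shared origin. The class and function names are hypothetical, and the underflow check is an extra sanity check added for the illustration:

```python
class Origin:
    def __init__(self, commit, path):
        self.commit = commit
        self.path = path
        self.refcount = 0
        self.freed = False  # stands in for actually releasing memory


def origin_incref(origin):
    """Take one more reference, e.g. when a blame entry starts pointing at it."""
    origin.refcount += 1
    return origin


def origin_decref(origin):
    """Give up one reference; free the origin when nobody uses it anymore."""
    if origin.refcount <= 0:
        raise RuntimeError("origin refcount underflow")
    origin.refcount -= 1
    if origin.refcount == 0:
        origin.freed = True
```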
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
When assigning blames for code movements across file boundaries,
we used to iterate over blame entries (i.e. groups of lines to
be blamed) in the outer loop and compared each entry with paths
in the parent commit in an inner loop. This meant that we
opened the blob data from each path a number of times.
Reorganize the loop so that we read the same path only once, and
compare it against all relevant blame entries.
This should perform better, but seems to give mixed results,
though.
Signed-off-by: Junio C Hamano <junkio@cox.net>
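The reorganized loop can be sketched as below: paths in the outer loop, blame entries in the inner loop, so each blob is read once. The read_blob and find_match callables are hypothetical stand-ins, and a higher score is assumed to mean a better match:

```python
def best_paths_for_entries(entries, parent_paths, read_blob, find_match):
    """For each blame entry, find the parent path that matches it best.

    Returns {entry index: (score, path)} for entries with a positive score.
    """
    best = {}
    for path in parent_paths:
        blob = read_blob(path)  # one read per path, not one per entry
        for index, entry in enumerate(entries):
            score = find_match(entry, blob)
            if score > best.get(index, (0, None))[0]:
                best[index] = (score, path)
    return best
```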
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
After finding out which path in the parent to scan to pass
blames, using get_tree_entry() to extract the blob information
again was quite wasteful, since diff-tree already gave us that
information. Separate the function to create an origin out as
get_origin().
You'll never know what is more efficient unless you try and/or
think hard. I somehow thought that extracting one known path
out of commit's tree is cheaper than running a diff-tree for the
current path between the commit and its parent, but it is not
the case. In real, non-toy projects, most commits do not touch
the path you are interested in, and if the path is a few levels
away from the toplevel, whole-subdirectory comparison logic
diff-tree allows us to skip opening lower subdirectories.
This commit rewrites find_origin() function to use a single-path
diff-tree to see if the parent has the same blob as the current
suspect, which is cheaper than extracting the blob information
using get_tree_entry() and comparing it with what the current
suspect has. This shaves about 6% overhead when annotating
kernel/sched.c in the Linux kernel repository on my machine.
The saving rises to 25% for arch/i386/kernel/Makefile.
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
It used to be that we can compare the address of the origin
structure to determine if they are the same because they are
always registered with scoreboard. After introduction of the
loop to try finding the best split, that is not true anymore.
The current code has rather serious leaks with origin structure,
but more importantly it gets confused when two origins
point at the same commit and the same path.
We might eventually have to refcount and gc origin, but let's
fix the correctness issue first.
Signed-off-by: Junio C Hamano <junkio@cox.net>
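To illustrate the correctness issue: once two distinct structures can describe the same commit and path, comparing addresses is no longer enough, and the comparison has to look at the contents. A sketch with hypothetical names:

```python
class Origin:
    def __init__(self, commit, path):
        self.commit = commit
        self.path = path


def same_origin(a, b):
    """Two origins are the same if they are the same object, or if they
    name the same commit and the same path."""
    if a is b:
        return True
    return a.commit == b.commit and a.path == b.path
```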
|
| |
| |
| |
| |
| |
| |
| | |
We need the commit buffer data while generating the final result,
but until then we do not need them.
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This adds scoring logic to blame_entry to prevent blames on very
trivial chunks (e.g. lots of empty lines, indent followed by a
closing brace) from being passed down to unrelated lines in the
parent.
The current heuristics are quite simple and may need to be
tweaked later, but we need to start somewhere.
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Instead of comparing number of lines matched, look at the
matched characters and count alnums, so that we do not pass
blame on not-so-interesting lines, such as an empty line and
a line that is indentation followed by a closing brace.
Add an option --score-debug to show the score of each
blame_entry while we cook this further on the "next" branch.
Signed-off-by: Junio C Hamano <junkio@cox.net>
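A minimal sketch of counting alphanumeric characters as the score. The function name is hypothetical, and note that Python's str.isalnum() also accepts non-ASCII letters and digits, unlike C's isalnum() in the default locale:

```python
def entry_score(lines):
    """Score a group of lines by the number of alphanumeric characters,
    so that blank lines and lone closing braces contribute nothing."""
    return sum(1 for line in lines for ch in line if ch.isalnum())
```

An entry made of an empty line and an indented closing brace scores zero, while even a short statement scores above zero.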
|
| |
| |
| |
| |
| |
| |
| | |
We would want to be able to refer to the end of the file as
"the beginning of Nth line" for a file that is N lines long.
Signed-off-by: Junio C Hamano <junkio@cox.net>
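One way to picture this is a table of line-start offsets with one extra entry at the end, so that with zero-based indexing entry N of an N-line file is the end of the file. A sketch, with a hypothetical function name:

```python
def line_starts(data):
    """Return the byte offset where each line begins, followed by a final
    entry equal to len(data)."""
    starts = []
    pos = 0
    while pos < len(data):
        starts.append(pos)
        newline = data.find(b"\n", pos)
        pos = len(data) if newline == -1 else newline + 1
    starts.append(len(data))  # "the beginning of the Nth line" of an N-line file
    return starts
```

With the extra entry, line k (zero-based) is always data[starts[k]:starts[k + 1]], including the last line.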
|
| |
| |
| |
| | |
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This completes the initial round of git-pickaxe. In addition to
the detection of line movements we already have, this finds new
lines that were created by moving or cutting-and-pasting lines
from different files in the parent.
With this,
git pickaxe -f -n -C v1.4.0 -- revision.c
finds that a major part of that file actually came from
rev-list.c when Linus split the latter at commit ae563642 and
blames them to earlier commits that touch rev-list.c.
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This makes pickaxe more intelligent than the classic blame.
A typical example is a change that moves one static C function
from lower part of the file to upper part of the same file,
because you added a new caller in the middle.
The versions in the parent and the child would look like this:
parent child
A static foo() {
B ...
C }
D A
E B
F C
G D
static foo() { ... call foo();
... E
} F
H G
H
With the classic blame algorithm, we can blame lines A B C D E F
G and H to the parent. The child is guilty of introducing the
line "... call foo();", and the blame is placed on the child.
However, the classic blame algorithm fails to notice that the
implementation of foo() at the top of the file is not new, and
moved from the lower part of the parent.
This commit introduces detection of such line movements, and
correctly blames the lines that were simply moved in the file to
the parent.
Signed-off-by: Junio C Hamano <junkio@cox.net>
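The idea can be sketched in two passes. This is a simplification that uses Python's difflib in place of the real diff machinery and only recognizes a moved run when the whole run appears contiguously in the parent; it is not the algorithm in pickaxe, and the names are hypothetical:

```python
import difflib


def contains_run(haystack, run):
    """True if `run` occurs as a contiguous slice of `haystack`."""
    n = len(run)
    return any(haystack[s:s + n] == run for s in range(len(haystack) - n + 1))


def blame_with_moves(parent, child):
    """Return 'parent' or 'child' for each line of `child`."""
    blame = ["child"] * len(child)
    # Pass 1: classic blame. Lines matched in order belong to the parent.
    matcher = difflib.SequenceMatcher(a=parent, b=child, autojunk=False)
    for _, b_start, size in matcher.get_matching_blocks():
        for k in range(size):
            blame[b_start + k] = "parent"
    # Pass 2: a leftover run that exists elsewhere in the parent was moved.
    i = 0
    while i < len(child):
        if blame[i] == "parent":
            i += 1
            continue
        j = i
        while j < len(child) and blame[j] == "child":
            j += 1
        if contains_run(parent, child[i:j]):
            blame[i:j] = ["parent"] * (j - i)
        i = j
    return blame
```

Applied to the parent and child shown above, only the "... call foo();" line stays blamed on the child.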
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Currently it does what git-blame does, but only faster.
More importantly, its internal structure is designed to support
content movement (aka cut-and-paste) more easily by allowing
more than one path to be taken from the same commit.
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
| |
| |
| |
| |
| |
| |
| |
| | |
This adds documentation for --progress and --all-progress, removes a
duplicate --progress handling and makes the usage string more readable.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
* jc/read-tree:
t6022: ignoring untracked files by merge-recursive when they do not matter
merge-recursive: adjust to loosened "working file clobbered" check
merge-recursive: make a few functions static.
merge-recursive: use abbreviated commit object name.
merge: loosen overcautious "working file will be lost" check.
|
| | |
| | |
| | |
| | | |
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The three-way merge by git-read-tree does not complain about
presence of the file in the working tree that is involved in a
merge when the merge result needs to be determined by the
caller. Adjust merge-recursive so that it makes sure that an
untracked file is not touched when the merge decides the path
should not be included in the final result.
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
| | |
| | |
| | |
| | | |
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
| | |
| | |
| | |
| | | |
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The three-way merge complained unconditionally when a path that
does not exist in the index is involved in a merge when it
existed in the working tree. If we are merging an old version
that had that path tracked, but the path is not tracked anymore,
and if we are merging that old version in, the result will be
that the path is not tracked. In that case we should not
complain.
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|\ \ \
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
* np/index-pack:
remove .keep pack lock files when done with refs update
have index-pack create .keep file more carefully
improve fetch-pack's handling of kept packs
git-fetch can use both --thin and --keep with fetch-pack now
Teach receive-pack how to keep pack files based on object count.
Allow pack header preprocessing before unpack-objects/index-pack.
Remove unused variable in receive-pack.
Revert "send-pack --keep: do not explode into loose objects on the receiving end."
missing small substitution
Teach git-index-pack how to keep a pack file.
Only repack active packs by skipping over kept packs.
Allow short pack names to git-pack-objects --unpacked=.
send-pack --keep: do not explode into loose objects on the receiving end.
index-pack: minor fixes to comment and function name
enhance clone and fetch -k experience
mimic unpack-objects when --stdin is used with index-pack
add progress status to index-pack
make index-pack able to complete thin packs.
enable index-pack streaming capability
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
This makes both git-fetch and git-push (fetch-pack and receive-pack)
safe against a possible race with a parallel git-repack -a -d that could
prune the new pack while it is not yet referenced, and remove the .keep
file after refs have been updated.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
If by chance we receive a pack whose content (list of objects) matches
another pack that we already have, and if that pack is marked with a
.keep file, then we should not overwrite it.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Since functions in fetch-clone.c were only used from fetch-pack.c,
its content has been merged with fetch-pack.c. This allows for better
coupling of features with much simpler implementations.
One new thing is that the (absence of) --thin is now also enforced on
index-pack, such that index-pack will abort if a thin pack was
_not_ asked for.
The -k or --keep, when provided twice, now causes the fetched pack
to be left as a kept pack just like receive-pack currently does.
Eventually this will be used to close a race against concurrent
repacking.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
| | | |
| | | |
| | | |
| | | |
| | | | |
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Since keeping a pushed pack or exploding it into loose objects
should be a local repository decision, this teaches receive-pack
to decide if it should call unpack-objects or index-pack --stdin
--fix-thin based on the setting of receive.unpackLimit and the
number of objects contained in the received pack.
If the number of objects (hdr_entries) in the received pack is
below the value of receive.unpackLimit (which is 5000 by default)
then we unpack-objects as we have in the past.
If the hdr_entries >= receive.unpackLimit then we call index-pack and
ask it to include our pid and hostname in the .keep file to make it
easier to identify why a given pack has been kept in the repository.
Currently this leaves every received pack as a kept pack. We really
don't want that as received packs will tend to be small. Instead we
want to delete the .keep file automatically after all refs have
been updated. That is being left as room for future improvement.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
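The decision described above can be sketched as follows. The function name is hypothetical, the default of 5000 is the one stated above, and the returned lists are only the program names and options mentioned in this message (the .keep message argument is left out):

```python
def choose_unpacker(hdr_entries, unpack_limit=5000):
    """Pick the program that should consume a received pack.

    Small packs are exploded into loose objects; packs at or above the
    limit are kept as packs and indexed.
    """
    if hdr_entries < unpack_limit:
        return ["unpack-objects"]
    return ["index-pack", "--stdin", "--fix-thin"]
```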
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Some applications which invoke unpack-objects or index-pack --stdin
may want to examine the pack header to determine the number of
objects contained in the pack and use that value to determine which
executable to invoke to handle the rest of the pack stream.
However if the caller consumes the pack header from the input stream
then it is no longer available for unpack-objects or index-pack --stdin,
both of which need the version and object count to process the stream.
This change introduces --pack_header=ver,cnt as a command line option
that the caller can supply to indicate it has already consumed the
pack header and what version and object count were found in that
header. As this option is only meant for low level applications
such as receive-pack we are not documenting it at this time.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
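A sketch of the caller's side: consume the 12-byte pack header (the "PACK" signature followed by a 4-byte version and a 4-byte object count, both in network byte order), then pass what was found along on the command line. The function names are hypothetical:

```python
import struct


def read_pack_header(stream):
    """Consume the pack header from `stream` and return (version, count)."""
    header = stream.read(12)
    if len(header) != 12:
        raise ValueError("short read while reading pack header")
    signature, version, count = struct.unpack(">4sII", header)
    if signature != b"PACK":
        raise ValueError("not a pack stream")
    return version, count


def pack_header_option(version, count):
    """Format the option that tells the next program what was consumed."""
    return f"--pack_header={version},{count}"
```

After read_pack_header() returns, the stream is positioned just past the header, which is why the version and count have to be handed over explicitly.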
|
| |\ \ \
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
* master: (90 commits)
gitweb: Better support for non-CSS aware web browsers
gitweb: Output also empty patches in "commitdiff" view
gitweb: Use git-for-each-ref to generate list of heads and/or tags
for-each-ref: "creator" and "creatordate" fields
Add --global option to git-repo-config.
pack-refs: Store the full name of the ref even when packing only tags.
git-clone documentation didn't mention --origin as equivalent of -o
Minor grammar fixes for git-diff-index.txt
link_temp_to_file: call adjust_shared_perm() only when we created the directory
Remove uneccessarily similar printf() from print_ref_list() in builtin-branch
pack-objects doesn't create random pack names
branch: work in subdirectories.
gitweb: Use 's' regexp modifier to secure against filenames with LF
gitweb: Secure against commit-ish/tree-ish with the same name as path
gitweb: esc_html() author in blame
git-svnimport: support for partial imports
link_temp_to_file: don't leave the path truncated on adjust_shared_perm failure
Move deny_non_fast_forwards handling completely into receive-pack.
revision traversal: --unpacked does not limit commit list anymore.
Continue traversal when rev-list --unpacked finds a packed commit.
...
|
| |\ \ \ \
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
* sp/keep-pack: (29 commits)
Remove unused variable in receive-pack.
Teach git-index-pack how to keep a pack file.
Only repack active packs by skipping over kept packs.
Allow short pack names to git-pack-objects --unpacked=.
git-send-email: Read the default SMTP server from the GIT config file
git-send-email: Document support for local sendmail instead of SMTP server
Swap the porcelain and plumbing commands in the git man page
Mention that pull can work locally in the synopsis
gitweb: Add "next" link to commitdiff view
gitweb: Move git_get_last_activity subroutine earlier
Documentation: fix git-format-patch mark-up and link it from git.txt
Documentation: Update information about <format> in git-for-each-ref
Bash completion support for aliases
gitweb: Fix up bogus $stylesheet declarations
tests: merge-recursive is usable without Python
gitweb: Check git base URLs before generating URL from it
Documentation: add git in /etc/services.
Documentation: add upload-archive service to git-daemon.
git-cherry: document limit and add diagram
diff-format.txt: Correct information about pathnames quoting in patch format
...
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
We aren't using this return code variable for anything so let's
just get rid of it to keep this section of code clean.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
To prevent a race condition between `index-pack --stdin` and
`repack -a -d` where the repack deletes the newly created pack
file before any refs are updated to reference objects contained
within it we mark the pack file as one that should be kept. This
removes it from the list of packs that `repack -a -d` will consider
for removal.
Callers such as `receive-pack` which want to invoke `index-pack`
should use this new --keep option to prevent the newly created pack
and index file pair from being deleted before they have finished any
related ref updates. Only after all ref updates have been finished
should the associated .keep file be removed.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
| | |\ \ \ \ |
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
During `git repack -a -d` only repack objects which are loose or
which reside in an active (a non-kept) pack. This allows the user
to keep large packs as-is without continuous repacking and can be
very helpful on large repositories. It should also help us resolve
a race condition between `git repack -a -d` and the new pack store
functionality in `git-receive-pack`.
Kept packs are those which have a corresponding .keep file in
$GIT_OBJECT_DIRECTORY/pack. That is, pack-X.pack will be kept
(not repacked and not deleted) if pack-X.keep exists in the same
directory when `git repack -a -d` starts.
Currently this feature is not documented and there is no user
interface to keep an existing pack.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
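The kept/active distinction can be sketched as a directory scan: a pack is kept when a .keep file with the same base name sits next to it. The function name is hypothetical, and this only illustrates the rule, not how repack implements it:

```python
import os


def partition_packs(pack_dir):
    """Split the *.pack files in `pack_dir` into (kept, active) name lists."""
    names = set(os.listdir(pack_dir))
    kept, active = [], []
    for name in sorted(names):
        if not name.endswith(".pack"):
            continue
        base = name[: -len(".pack")]
        # pack-X.pack is kept if pack-X.keep exists in the same directory.
        if base + ".keep" in names:
            kept.append(name)
        else:
            active.append(name)
    return kept, active
```

Only the packs in the second list would be candidates for repacking and removal.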
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
This allows us to pass just the file name of a pack rather than
the complete path when we want pack-objects to consider its
contents as though they were loose objects. This can be helpful
if $GIT_OBJECT_DIRECTORY contains shell metacharacters which make
it cumbersome to pass complete paths safely in a shell script.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|