summaryrefslogtreecommitdiff
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'jh/notes' (early part)Junio C Hamano2009-11-2020-10/+1408
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'jh/notes' (early part): Add selftests verifying concatenation of multiple notes for the same commit Refactor notes code to concatenate multiple notes annotating the same object Add selftests verifying that we can parse notes trees with various fanouts Teach the notes lookup code to parse notes trees with various fanout schemes Teach notes code to free its internal data structures on request Add '%N'-format for pretty-printing commit notes Add flags to get_commit_notes() to control the format of the note string t3302-notes-index-expensive: Speed up create_repo() fast-import: Add support for importing commit notes Teach "-m <msg>" and "-F <file>" to "git notes edit" Add an expensive test for git-notes Speed up git notes lookup Add a script to edit/inspect notes Introduce commit notes Conflicts: .gitignore Documentation/pretty-formats.txt pretty.c
| * Add selftests verifying concatenation of multiple notes for the same commitJohan Herland2009-10-191-0/+84
| | | | | | | | | | | | | | | | Also verify that multiple references to the _same_ note blob are _not_ concatenated. Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * Refactor notes code to concatenate multiple notes annotating the same objectJohan Herland2009-10-191-82/+161
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, having multiple notes referring to the same commit from various locations in the notes tree is strongly discouraged, since only one of those notes will be parsed and shown. This patch teaches the notes code to _concatenate_ multiple notes that annotate the same commit. Notes are concatenated by creating a new blob object containing the concatenation of the notes in question, and replacing them with the concatenated note in the internal notes tree structure. Getting the concatenation right requires being more proactive in unpacking subtree entries in the internal notes tree structure, so that we don't return a note prematurely (i.e. before having found all other notes that annotate the same object). As such, this patch may incur a small performance penalty. Suggested-by: Sam Vilain <sam@vilain.net> Re-suggested-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * Add selftests verifying that we can parse notes trees with various fanoutsJohan Herland2009-10-191-0/+104
| | | | | | | | | | Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * Teach the notes lookup code to parse notes trees with various fanout schemesJohan Herland2009-10-191-69/+248
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The semantics used when parsing notes trees (with regards to fanout subtrees) follow Dscho's proposal fairly closely: - No concatenation/merging of notes is performed. If there are several notes objects referencing a given commit, only one of those objects are used. - If a notes object for a given commit is present in the "root" notes tree, no subtrees are consulted; the object in the root tree is used directly. - If there are more than one subtree that prefix-matches the given commit, only the subtree with the longest matching prefix is consulted. This means that if the given commit is e.g. "deadbeef", and the notes tree have subtrees "de" and "dead", then the following paths in the notes tree are searched: "deadbeef", "dead/beef". Note that "de/adbeef" is NOT searched. - Fanout directories (subtrees) must references a whole number of bytes from the SHA1 sum they subdivide. E.g. subtrees "dead" and "de" are acceptable; "d" and "dea" are not. - Multiple levels of fanout are allowed. All the above rules apply recursively. E.g. "de/adbeef" is preferred over "de/adbe/ef", etc. This patch changes the in-memory datastructure for holding parsed notes: Instead of holding all note (and subtree) entries in a hash table, a simple 16-tree structure is used instead. The tree structure consists of 16-arrays as internal nodes, and note/subtree entries as leaf nodes. The tree is traversed by indexing subsequent nibbles of the search key until a leaf node is encountered. If a subtree entry is encountered while searching for a note, the subtree is unpacked into the 16-tree structure, and the search continues into that subtree. The new algorithm performs significantly better in the cases where only a fraction of the notes need to be looked up (this is assumed to be the common case for notes lookup). The new code even performs marginally better in the worst case (where _all_ the notes are looked up). In addition to this, comes the massive performance win associated with organizing the notes tree according to some fanout scheme. Even a simple 2/38 fanout scheme is dramatically quicker to traverse (going from tens of seconds to sub-second runtimes). As for memory usage, the new code is marginally better than the old code in the worst case, but in the case of looking up only some notes from a notes tree with proper fanout, the new code uses only a small fraction of the memory needed to hold the entire notes tree. However, there is one casualty of this patch. The old notes lookup code was able to parse notes that were associated with non-SHA1s (e.g. refs). The new code requires the referenced object to be named by a SHA1 sum. Still, this is not considered a major setback, since the notes infrastructure was not originally intended to annotate objects outside the Git object database. Cc: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * Teach notes code to free its internal data structures on requestJohan Herland2009-10-192-0/+10
| | | | | | | | | | | | | | | | | | | | | | There's no need to be rude to memory-concious callers... This patch has been improved by the following contributions: - Junio C Hamano: avoid old-style declaration Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * Add '%N'-format for pretty-printing commit notesJohannes Schindelin2009-10-192-0/+5
| | | | | | | | | | Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * Add flags to get_commit_notes() to control the format of the note stringJohan Herland2009-10-193-5/+11
| | | | | | | | | | | | | | | | | | | | | | This patch adds the following flags to get_commit_notes() for adjusting the format of the produced note string: - NOTES_SHOW_HEADER: Print "Notes:" line before the notes contents - NOTES_INDENT: Indent notes contents by 4 spaces Suggested-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * t3302-notes-index-expensive: Speed up create_repo()Johan Herland2009-10-191-27/+47
| | | | | | | | | | | | | | | | | | | | | | | | | | Creating repos with 10/100/1000/10000 commits and notes takes a lot of time. However, using git-fast-import to do the job is a lot more efficient than using plumbing commands to do the same. This patch decreases the overall run-time of this test on my machine from ~3 to ~1 minutes. Signed-off-by: Johan Herland <johan@herland.net> Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * fast-import: Add support for importing commit notesJohan Herland2009-10-193-10/+289
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce a 'notemodify' subcommand of the 'commit' command. This subcommand is similar to 'filemodify', except that no mode is supplied (all notes have mode 0644), and the path is set to the hex SHA1 of the given "comittish". This enables fast import of note objects along with their associated commits, since the notes can now be named using the mark references of their corresponding commits. The patch also includes a test case of the added functionality. Signed-off-by: Johan Herland <johan@herland.net> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * Teach "-m <msg>" and "-F <file>" to "git notes edit"Johan Herland2009-10-193-9/+107
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The "-m" and "-F" options are already the established method (in both git-commit and git-tag) to specify a commit/tag message without invoking the editor. This patch teaches "git notes edit" to respect the same options for specifying a notes message without invoking the editor. Multiple "-m" and/or "-F" options are concatenated as separate paragraphs. The patch also updates the "git notes" documentation and adds selftests for the new functionality. Unfortunately, the added selftests include a couple of lines with trailing whitespace (without these the test will fail). This may cause git to warn about "whitespace errors". This patch has been improved by the following contributions: - Thomas Rast: fix trailing whitespace in t3301 Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * Add an expensive test for git-notesJohannes Schindelin2009-10-191-0/+98
| | | | | | | | | | | | | | | | | | | | | | | | | | | | git-notes have the potential of being pretty expensive, so test with a lot of commits. A lot. So to make things cheaper, you have to opt-in explicitely, by setting the environment variable GIT_NOTES_TIMING_TESTS. This patch has been improved by the following contributions: - Junio C Hamano: tests: fix "export var=val" Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * Speed up git notes lookupJohannes Schindelin2009-10-191-10/+102
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To avoid looking up each and every commit in the notes ref's tree object, which is very expensive, speed things up by slurping the tree object's contents into a hash_map. The idea for the hashmap singleton is from David Reiss, initial benchmarking by Jeff King. Note: the implementation allows for arbitrary entries in the notes tree object, ignoring those that do not reference a valid object. This allows you to annotate arbitrary branches, or objects. This patch has been improved by the following contributions: - Junio C Hamano: fixed an obvious error in initialize_hash_map() Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * Add a script to edit/inspect notesJohannes Schindelin2009-10-196-0/+236
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The script 'git notes' allows you to edit and show commit notes, by calling either git notes show <commit> or git notes edit <commit> This patch has been improved by the following contributions: - Tor Arne Vestbø: fix printing of multi-line notes - Michael J Gruber: test and handle empty notes gracefully - Thomas Rast: - only clean up message file when editing - use GIT_EDITOR and core.editor over VISUAL/EDITOR - t3301: fix confusing quoting in test for valid notes ref - t3301: use test_must_fail instead of ! - refuse to edit notes outside refs/notes/ - Junio C Hamano: tests: fix "export var=val" - Christian Couder: documentation: fix 'linkgit' macro in "git-notes.txt" - Johan Herland: minor cleanup and bugfixing in git-notes.sh (v2) Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Tor Arne Vestbø <tavestbo@trolltech.com> Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * Introduce commit notesJohannes Schindelin2009-10-199-0/+108
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit notes are blobs which are shown together with the commit message. These blobs are taken from the notes ref, which you can configure by the config variable core.notesRef, which in turn can be overridden by the environment variable GIT_NOTES_REF. The notes ref is a branch which contains "files" whose names are the names of the corresponding commits (i.e. the SHA-1). The rationale for putting this information into a ref is this: we want to be able to fetch and possibly union-merge the notes, maybe even look at the date when a note was introduced, and we want to store them efficiently together with the other objects. This patch has been improved by the following contributions: - Thomas Rast: fix core.notesRef documentation - Tor Arne Vestbø: fix printing of multi-line notes - Alex Riesen: Using char array instead of char pointer costs less BSS - Johan Herland: Plug leak when msg is good, but msglen or type causes return Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Tor Arne Vestbø <tavestbo@trolltech.com> Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> get_commit_notes(): Plug memory leak when 'if' triggers, but not because of read_sha1_file() failure
* | Merge branch 'sp/smart-http'Junio C Hamano2009-11-2035-283/+2937
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * sp/smart-http: (37 commits) http-backend: Let gcc check the format of more printf-type functions. http-backend: Fix access beyond end of string. http-backend: Fix bad treatment of uintmax_t in Content-Length t5551-http-fetch: Work around broken Accept header in libcurl t5551-http-fetch: Work around some libcurl versions http-backend: Protect GIT_PROJECT_ROOT from /../ requests Git-aware CGI to provide dumb HTTP transport http-backend: Test configuration options http-backend: Use http.getanyfile to disable dumb HTTP serving test smart http fetch and push http tests: use /dumb/ URL prefix set httpd port before sourcing lib-httpd t5540-http-push: remove redundant fetches Smart HTTP fetch: gzip requests Smart fetch over HTTP: client side Smart push over HTTP: client side Discover refs via smart HTTP server when available http-backend: more explict LocationMatch http-backend: add example for gitweb on same URL http-backend: use mod_alias instead of mod_rewrite ... Conflicts: .gitignore remote-curl.c
| * | http-backend: Let gcc check the format of more printf-type functions.Tarmigan Casebolt2009-11-151-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We already have these checks in many printf-type functions that have prototypes which are in header files. Add these same checks to static functions in http-backend.c Signed-off-by: Tarmigan Casebolt <tarmigan+git@gmail.com> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | http-backend: Fix access beyond end of string.Tarmigan Casebolt2009-11-151-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | Found with valgrind while looking for Content-Length corruption in smart http. Signed-off-by: Tarmigan Casebolt <tarmigan+git@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | http-backend: Fix bad treatment of uintmax_t in Content-LengthShawn O. Pearce2009-11-131-6/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Our Content-Length needs to report an off_t, which could be larger precision than size_t on this system (e.g. 32 bit binary built with 64 bit large file support). We also shouldn't be passing a size_t parameter to printf when we've used PRIuMAX as the format specifier. Fix both issues by using uintmax_t for the hdr_int() routine, allowing strbuf's size_t to automatically upcast, and off_t to always fit. Also fixed the copy loop we use inside of send_local_file(), we never actually updated the size variable so we might as well not use it. Reported-by: Tarmigan <tarmigan+git@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | t5551-http-fetch: Work around broken Accept header in libcurlShawn O. Pearce2009-11-091-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Unfortunately at least one version of libcurl has a bug causing it to include "Accept: */*" in the same POST request where we have already asked for "Accept: application/x-git-upload-pack-response". This is a bug in libcurl, not Git, or our test vector. The application has explicitly asked the server for a single content type, but libcurl has mistakenly also told the server the client application will accept */*, which is any content type. Based on the libcurl change log, this "Accept: */*" header bug may have been fixed in version 7.18.1 released March 30, 2008: http://curl.haxx.se/changes.html#7_18_1 Rather than require users to upgrade libcurl we change the test vector to trim this line out of the 2nd request. Reported-by: Tarmigan <tarmigan+git@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | t5551-http-fetch: Work around some libcurl versionsShawn O. Pearce2009-11-091-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some versions of libcurl report their output when GIT_CURL_VERBOSE is set differently than other versions do. At least one variant (version unknown but likely pre-7.18.1) reports the POST payload to stderr, and omits the blank line after each HTTP request/response. We clip these lines out of the stderr output now before doing the compare, so we aren't surprised by this trivial difference. Reported-by: Tarmigan <tarmigan+git@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | http-backend: Protect GIT_PROJECT_ROOT from /../ requestsShawn O. Pearce2009-11-095-48/+86
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Eons ago HPA taught git-daemon how to protect itself from /../ attacks, which Junio brought back into service in d79374c7b58d ("daemon.c and path.enter_repo(): revamp path validation"). I did not carry this into git-http-backend as originally we relied only upon PATH_TRANSLATED, and assumed the HTTP server had done its access control checks to validate the resolved path was within a directory permitting access from the remote client. This would usually be sufficient to protect a server from requests for its /etc/passwd file by http://host/smart/../etc/passwd sorts of URLs. However in 917adc036086 Mark Lodato added GIT_PROJECT_ROOT as an additional method of configuring the CGI. When this environment variable is used the web server does not generate the final access path and therefore may blindly pass through "/../etc/passwd" in PATH_INFO under the assumption that "/../" might have special meaning to the invoked CGI. Instead of permitting these sorts of malformed path requests, we now reject them back at the client, with an error message for the server log. This matches git-daemon behavior. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | Git-aware CGI to provide dumb HTTP transportShawn O. Pearce2009-11-091-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | http-backend: Fix symbol clash on AIX 5.3 Mike says: > > +static void send_file(const char *the_type, const char *name) > > +{ > > I think a symbol clash here is responsible for a build breakage in > next on AIX 5.3: > > CC http-backend.o > http-backend.c:213: error: conflicting types for `send_file' > /usr/include/sys/socket.h:676: error: previous declaration of `send_file' > gmake: *** [http-backend.o] Error 1 So we rename the function send_local_file(). Reported-by: Mike Ralphson <mike.ralphson@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | http-backend: Test configuration optionsShawn O. Pearce2009-11-041-0/+229
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Test the major configuration settings which control access to the repository: http.getanyfile http.uploadpack http.receivepack Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | http-backend: Use http.getanyfile to disable dumb HTTP servingShawn O. Pearce2009-11-042-6/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some repository owners may wish to enable smart HTTP, but disallow dumb content serving. Disallowing dumb serving might be because the owners want to rely upon reachability to control which objects clients may access from the repository, or they just want to encourage clients to use the more bandwidth efficient transport. If http.getanyfile is set to false the backend CGI will return with '403 Forbidden' when an object file is accessed by a dumb client. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | test smart http fetch and pushShawn O. Pearce2009-11-045-2/+219
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The top level directory "/smart/" of the test Apache server is mapped through our git-http-backend CGI, but uses the same underlying repository space as the server's document root. This is the most simple installation possible. Server logs are checked to verify the client has accessed only the smart URLs during the test. During fetch testing the headers are also logged from libcurl to ensure we are making a reasonably sane HTTP request, and getting back reasonably sane response headers from the CGI. When validating the request headers used during smart fetch we munge away the actual Content-Length and replace it with the placeholder "xxx". This avoids unnecessary varability in the test caused by an unrelated change in the requested capabilities in the first want line of the request. However, we still want to look for and verify that Content-Length was used, because smaller payloads should be using Content-Length and not "Transfer-Encoding: chunked". When validating the server response headers we must discard both Content-Length and Transfer-Encoding, as Apache2 can use either format to return our response. During development of this test I observed Apache returning both forms, depending on when the processes got CPU time. If our CGI returned the pack data quickly, Apache just buffered the whole thing and returned a Content-Length. If our CGI took just a bit too long to complete, Apache flushed its buffer and instead used "Transfer-Encoding: chunked". Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | http tests: use /dumb/ URL prefixShawn O. Pearce2009-11-043-8/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To clarify what part of the HTTP transprot is being tested we change the URLs used by existing tests to include /dumb/ at the start, indicating they use the non-Git aware code paths. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> CC: Tay Ray Chuan <rctay89@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | set httpd port before sourcing lib-httpdClemens Buchacher2009-11-041-4/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | If LIB_HTTPD_PORT is not set already, lib-httpd will set it to the default 8111. Signed-off-by: Clemens Buchacher <drizzd@aon.at> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | t5540-http-push: remove redundant fetchesTay Ray Chuan2009-11-041-2/+0
| | | | | | | | | | | | | | | | | | Signed-off-by: Tay Ray Chuan <rctay89@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | Smart HTTP fetch: gzip requestsShawn O. Pearce2009-11-041-0/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The upload-pack requests are mostly plain text and they compress rather well. Deflating them with Content-Encoding: gzip can easily drop the size of the request by 50%, reducing the amount of data to transfer as we negotiate the common commits. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> CC: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | Smart fetch over HTTP: client sideShawn O. Pearce2009-11-043-22/+160
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The git-remote-curl backend detects if the remote server supports the git-upload-pack service, and if so, runs git-fetch-pack locally in a pipe to generate the want/have commands. The advertisements from the server that were obtained during the discovery are passed into git-fetch-pack before the POST request starts, permitting server capability discovery and enablement. Common objects that are discovered are appended onto the request as have lines and are sent again on the next request. This allows the remote side to reinitialize its in-memory list of common objects during the next request. Because all requests are relatively short, below git-remote-curl's 1 MiB buffer limit, requests will use the standard Content-Length header and be valid HTTP/1.0 POST requests. This makes the fetch client more tolerant of proxy servers which don't support HTTP/1.1 or the chunked transfer encoding. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> CC: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | Smart push over HTTP: client sideShawn O. Pearce2009-11-048-15/+374
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The git-remote-curl backend detects if the remote server supports the git-receive-pack service, and if so, runs git-send-pack in a pipe to dump the command and pack data as a single POST request. The advertisements from the server that were obtained during the discovery are passed into git-send-pack before the POST request starts. This permits git-send-pack to operate largely unmodified. For smaller packs (those under 1 MiB) a HTTP/1.0 POST with a Content-Length is used, permitting interaction with any server. The 1 MiB limit is arbitrary, but is sufficent to fit most deltas created by human authors against text sources with the occasional small binary file (e.g. few KiB icon image). The configuration option http.postBuffer can be used to increase (or shink) this buffer if the default is not sufficient. For larger packs which cannot be spooled entirely into the helper's memory space (due to http.postBuffer being too small), the POST request requires HTTP/1.1 and sets "Transfer-Encoding: chunked". This permits the client to upload an unknown amount of data in one HTTP transaction without needing to pregenerate the entire pack file locally. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> CC: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | Discover refs via smart HTTP server when availableShawn O. Pearce2009-11-041-17/+131
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of loading the cached info/refs, try to use the smart HTTP version when the server supports it. Since the smart variant is actually the pkt-line stream from the start of either upload-pack or receive-pack we need to parse these through get_remote_heads, which requires a background thread to feed its pipe. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> CC: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | http-backend: more explict LocationMatchMark Lodato2009-11-041-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | In the git-http-backend examples, only match git-receive-pack within /git/. Signed-off-by: Mark Lodato <lodatom@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | http-backend: add example for gitweb on same URLMark Lodato2009-11-041-0/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the git-http-backend documentation, add an example of how to set up gitweb and git-http-backend on the same URL by using a series of mod_alias commands. Signed-off-by: Mark Lodato <lodatom@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | http-backend: use mod_alias instead of mod_rewriteMark Lodato2009-11-041-7/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the git-http-backend documentation, use mod_alias exlusively, instead of using a combination of mod_alias and mod_rewrite. This makes the example slightly shorted and a bit more clear. Signed-off-by: Mark Lodato <lodatom@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | http-backend: reword some documentationMark Lodato2009-11-041-4/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Clarify some of the git-http-backend documentation, particularly: * In the Description, state that smart/dumb HTTP fetch and smart HTTP push are supported, state that authenticated clients allow push, and remove the note that this is only suited for read-only updates. * At the start of Examples, state explicitly what URL is mapping to what location on disk. Signed-off-by: Mark Lodato <lodatom@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | http-backend: add GIT_PROJECT_ROOT environment varMark Lodato2009-11-042-25/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a new environment variable, GIT_PROJECT_ROOT, to override the method of using PATH_TRANSLATED to find the git repository on disk. This makes it much easier to configure the web server, especially when the web server's DocumentRoot does not contain the git repositories, which is the usual case. Signed-off-by: Mark Lodato <lodatom@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | Smart fetch and push over HTTP: server sideShawn O. Pearce2009-11-042-4/+359
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | Add stateless RPC options to upload-pack, receive-packShawn O. Pearce2009-11-042-10/+56
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When --stateless-rpc is passed as a command line parameter to upload-pack or receive-pack the programs now assume they may perform only a single read-write cycle with stdin and stdout. This fits with the HTTP POST request processing model where a program may read the request, write a response, and must exit. When --advertise-refs is passed as a command line parameter only the initial ref advertisement is output, and the program exits immediately. This fits with the HTTP GET request model, where no request content is received but a response must be produced. HTTP headers and/or environment are not processed here, but instead are assumed to be handled by the program invoking either service backend. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | Git-aware CGI to provide dumb HTTP transportShawn O. Pearce2009-11-044-0/+396
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | remote-helpers: return successfully if everything up-to-dateClemens Buchacher2009-10-302-1/+3
| | | | | | | | | | | | | | | | | | Signed-off-by: Clemens Buchacher <drizzd@aon.at> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | Move WebDAV HTTP push under remote-curlShawn O. Pearce2009-10-306-54/+287
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The remote helper interface now supports the push capability, which can be used to ask the implementation to push one or more specs to the remote repository. For remote-curl we implement this by calling the existing WebDAV based git-http-push executable. Internally the helper interface uses the push_refs transport hook so that the complexity of the refspec parsing and matching can be reused between remote implementations. When possible however the helper protocol uses source ref name rather than the source SHA-1, thereby allowing the helper to access this name if it is useful. >From Clemens Buchacher <drizzd@aon.at>: update http tests according to remote-curl capabilities o Pushing packed refs is now fixed. o The transport helper fails if refs are already up-to-date. Add a test for that. o The transport helper will notice if refs are already up-to-date. We therefore need to update server info in the unpacked-refs test. o The transport helper will purge deleted branches automatically. o Use a variable ($ORIG_HEAD) instead of full SHA-1 name. Signed-off-by: Tay Ray Chuan <rctay89@gmail.com> Signed-off-by: Clemens Buchacher <drizzd@aon.at> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> CC: Daniel Barkalow <barkalow@iabervon.org> CC: Mike Hommey <mh@glandium.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | remote-helpers: Support custom transport optionsShawn O. Pearce2009-10-303-2/+198
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some transports, like the native pack transport implemented by fetch-pack, support useful features like depth or include tags. These should be exposed if the underlying helper knows how to use them. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> CC: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | remote-helpers: Fetch more than one ref in a batchShawn O. Pearce2009-10-303-26/+115
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some network protocols (e.g. native git://) are able to fetch more than one ref at a time and reduce the overall transfer cost by combining the requests into a single exchange. Instead of feeding each fetch request one at a time to the helper, feed all of them at once so the helper can decide whether or not it should batch them. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> CC: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | fetch: Allow transport -v -v -v to set verbosity to 3Shawn O. Pearce2009-10-302-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Helpers might want a higher level of verbosity than just +1 (the porcelain default setting) and +2 (-v -v). Expand the field to allow verbosity in the range -1..3. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> CC: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | remote-curl: Refactor walker initializationShawn O. Pearce2009-10-301-10/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We will need the walker, url and remote in other functions as the code grows larger to support smart HTTP. Extract this out into a set of globals we can easily reference once configured. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> CC: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | Add multi_ack_detailed capability to fetch-pack/upload-packShawn O. Pearce2009-10-302-22/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When multi_ack_detailed is enabled the ACK continue messages returned by the remote upload-pack are broken out to describe the different states within the peer. This permits the client to better understand the server's in-memory state. The fetch-pack/upload-pack protocol now looks like: NAK --------------------------------- Always sent in response to "done" if there was no common base selected from the "have" lines (or no have lines were sent). * no multi_ack or multi_ack_detailed: Sent when the client has sent a pkt-line flush ("0000") and the server has not yet found a common base object. * either multi_ack or multi_ack_detailed: Always sent in response to a pkt-line flush. ACK %s ----------------------------------- * no multi_ack or multi_ack_detailed: Sent in response to "have" when the object exists on the remote side and is therefore an object in common between the peers. The argument is the SHA-1 of the common object. * either multi_ack or multi_ack_detailed: Sent in response to "done" if there are common objects. The argument is the last SHA-1 determined to be common. ACK %s continue ----------------------------------- * multi_ack only: Sent in response to "have". The remote side wants the client to consider this object as common, and immediately stop transmitting additional "have" lines for objects that are reachable from it. The reason the client should stop is not given, but is one of the two cases below available under multi_ack_detailed. ACK %s common ----------------------------------- * multi_ack_detailed only: Sent in response to "have". Both sides have this object. Like with "ACK %s continue" above the client should stop sending have lines reachable for objects from the argument. ACK %s ready ----------------------------------- * multi_ack_detailed only: Sent in response to "have". The client should stop transmitting objects which are reachable from the argument, and send "done" soon to get the objects. If the remote side has the specified object, it should first send an "ACK %s common" message prior to sending "ACK %s ready". Clients may still submit additional "have" lines if there are more side branches for the client to explore that might be added to the common set and reduce the number of objects to transfer. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | Move "get_ack()" back to fetch-packShawn O. Pearce2009-10-303-22/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In 41cb7488 Linus moved this function to connect.c for reuse inside of the git-clone-pack command. That was 2005, but in 2006 Junio retired git-clone-pack in commit efc7fa53. Since then the only caller has been fetch-pack. Since this ACK/NAK exchange is only used by the fetch-pack/upload-pack protocol we should move it back to be a private detail of fetch-pack. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | fetch-pack: Use a strbuf to compose the want listShawn O. Pearce2009-10-303-25/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This change is being offered as a refactoring to make later commits in the smart HTTP series easier. By changing the enabled capabilities to be formatted in a strbuf it is easier to add a new capability to the set of supported capabilities. By formatting the want portion of the request into a strbuf and writing it as a whole block we can later decide to hold onto the req_buf (instead of releasing it) to recycle in stateless communications. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>