| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Each window of an svndiff0-format delta includes a section for novel
text to be copied to the postimage (in the order it appears in the
window, possibly interspersed with other data).
Slurp in this data when encountering it. It is not actually necessary
to do so --- it would be just as easy to copy from delta to output
as part of interpreting the relevant instructions --- but this way,
the code that interprets svndiff0 instructions can proceed very
quickly because it does not require I/O.
Subversion's svndiff0 parser rejects deltas that do not consume all
the novel text that was provided. Omit that check for now so we can
test the new functionality right away, rather than waiting to learn
instructions that consume data.
Do check for truncated data sections. Subversion's parser rejects
deltas that end in the middle of a declared novel-text section, so it
should be safe for us to reject them, too.
Improved-by: Ramkumar Ramachandra <artagnon@gmail.com>
Improved-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by: Ramkumar Ramachandra <artagnon@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The source view offset heading each svndiff0 window represents a
number of bytes past the beginning of the preimage. Together with the
source view length, it dictates to the delta applier what portion of
the preimage instructions will refer to. Read that portion right away
using the sliding window code.
Maybe some day we will use mmap to read data more lazily.
Subversion's implementation tolerates source view offsets pointing
past the end of the preimage file but we do not, for simplicity.
This does not teach the delta applier to read instructions or copy
data from the source view. Deltas that could produce nonempty output
will still be rejected.
Improved-by: Ramkumar Ramachandra <artagnon@gmail.com>
Improved-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by: Ramkumar Ramachandra <artagnon@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Each window in a subversion delta (svndiff0-format file) starts with a
window header, consisting of five integers with variable-length
representation:
source view offset
source view length
output length
instructions length
auxiliary data length
Parse it. The result is not usable for deltas with nonempty postimage
yet; in fact, this only adds support for deltas without any
instructions or auxiliary data. This is a good place to stop, though,
since that little support lets us add some simple passing tests
concerning error handling to the test suite.
Improved-by: Ramkumar Ramachandra <artagnon@gmail.com>
Improved-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A delta in the subversion delta (svndiff0) format consists of the
magic bytes SVN\0 followed by a sequence of windows of a certain well
specified format (starting with five integers).
Add an svndiff0_apply function and test-svn-fe -d commandline tool to
parse such a delta in the special case of not including any windows.
Later patches will add features to turn this into a fully functional
delta applier for svn-fe to use to parse the streams produced by
"svnrdump dump" and "svnadmin dump --deltas".
The content of symlinks starts with the word "link " in Subversion's
worldview, so we need to be able to prepend that text to input for the
sake of delta application. So initialization of the input state of
the delta preimage is left to the calling program, giving callers a
chance to seed the buffer with text of their choice.
Improved-by: Ramkumar Ramachandra <artagnon@gmail.com>
Improved-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
buffer_read_binary is a thin wrapper around fread, but its signature
is wrong:
- fread can fill an arbitrary in-memory buffer. buffer_read_binary
is limited to buffers whose size is representable by a 32-bit
integer.
- The result from fread is the number of bytes actually read.
buffer_read_binary only reports the number of bytes read by
incrementing sb->len by that amount and returns void.
Fix both: let buffer_read_binary accept a size_t instead of uint32_t
for the number of bytes to read and as a convenience return the number
of bytes actually read.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Each section of a Subversion-format delta only requires examining (and
keeping in random-access memory) a small portion of the preimage. At
any moment, this portion starts at a certain file offset and has a
well-defined length, and as the delta is applied, the portion advances
from the beginning to the end of the preimage. Add a move_window
function to keep track of this view into the preimage.
You can use it like this:
buffer_init(f, NULL);
struct sliding_view window = SLIDING_VIEW_INIT(f);
move_window(&window, 3, 7); /* (1) */
move_window(&window, 5, 5); /* (2) */
move_window(&window, 12, 2); /* (3) */
strbuf_release(&window.buf);
buffer_deinit(f);
The data structure is called sliding_view instead of _window to
prevent confusion with svndiff0 Windows.
In this example, (1) reads 10 bytes and discards the first 3;
(2) discards the first 2, which are not needed any more; and (3) skips
2 bytes and reads 2 new bytes to work with.
When move_window returns, the file position indicator is at position
window->off + window->width and the data from positions window->off to
the current file position are stored in window->buf.
This function performs only sequential access from the input file and
never seeks, so it can be safely used on pipes and sockets.
On end-of-file, move_window silently reads less than the caller
requested. On other errors, it prints a message and returns -1.
Helped-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As the svn import infrastructure evolves, it's getting to be a pain to
tell by eye what files were added or removed from a dependency line
like
VCSSVN_OBJS = vcs-svn/string_pool.o vcs-svn/line_buffer.o \
vcs-svn/repo_tree.o vcs-svn/fast_export.o vcs-svn/svndump.o
So use a style with one entry per line instead, like the existing
BUILTIN_OBJS:
# protect against environment
VCSSVN_OBJS =
...
VCSSVN_OBJS += vcs-svn/string_pool.o
VCSSVN_OBJS += vcs-svn/line_buffer.o
...
which is readable on its own and produces nice, clear diffs.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
gcc -m32 correctly warns:
vcs-svn/fast_export.c: In function 'fast_export_commit':
vcs-svn/fast_export.c:54:2: warning: format '%llu' expects
argument of type 'long long unsigned int', but argument 2
has type 'unsigned int' [-Wformat]
Fix it.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Pass the log message by strbuf instead of as a C-style string and use
fwrite instead of printf to write it to fast-import so embedded '\0'
bytes can be preserved.
Currently "git log" doesn't show the embedded NULs but "git cat-file
commit" can.
While at it, stop including system headers from repo_tree.h. git
source files need to include git-compat-util.h (or cache.h or
builtin.h) sooner to ensure the appropriate feature test macros are
defined.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
|
|
|
|
|
|
| |
Use strbuf_swap when storing the svn:log and svn:author properties, so
pointers to rather than the contents of buffers get copied. The main
effect should be to make the code a little easier to read.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
|
|
|
|
|
|
| |
All previous users of buffer_read_string have already been converted
to use the more intuitive buffer_read_binary, so remove the old API to
avoid some confusion.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
svn-fe errors out on revision 59151 of the ASF repository:
fatal: invalid dump: unexpected end of file
The proximate cause is a property with an embedded NUL character.
Previously such anomalies were ignored but commit c9d1c8ba
(2010-12-28) introduced a check strlen(val) == len to avoid reading
uninitialized data when a property list ends early and unfortunately
this test does not distinguish between "foo" followed by EOF and the
string "foo\0bar\0baz".
Fix it by using buffer_read_binary to read to a strbuf and checking
the actual length read. Most consumers of properties still use
C-style strings, so in practice an author or log message with embedded
NULs will be truncated, but a least this way svn-fe won't error out
(fixing the regression).
Reported-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| | |
* db/length-as-hash:
vcs-svn: use strchr to find RFC822 delimiter
vcs-svn: implement perfect hash for top-level keys
vcs-svn: implement perfect hash for node-prop keys
Conflicts:
vcs-svn/svndump.c
|
| |
| |
| |
| |
| |
| |
| |
| | |
This is a small optimisation (4% reduction in user time) but is the
largest artifact within the parsing portion of svndump.c
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Instead of interning property names and comparing their string_pool
keys, look them up in a table by string length, which should be about
as fast.
Another small step towards removing dependence on string_pool
altogether.
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Instead of interning property names and comparing their string_pool
keys, look them up in a table by string length, which should be about
as fast.
This is a small step towards removing dependence on string_pool.
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Use strbufs and strings instead of interned strings for values of rev,
dump, and node fields that happen to be strings. After this change,
the only remaining string_pool use is for paths in the repo_tree API
and internals.
Functional change: treat an empty author, UUID, or URL as none at all.
So for example, in repos where the first revision has an empty
svn:author property, the first rev will be treated as by "nobody"
rather than by a person with empty name and email address created by
prepending an @ sign to the repository UUID.
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|/
|
|
|
|
|
|
|
|
|
|
| |
obj_pool is overkill for this application: all that is needed is a
buffer that can resize from rev to rev to accomodate differently-sized
strings. In the spirit of commit deadcef4 (2010-11-06), use a strbuf
instead.
This is a small step towards removing dependence on obj_pool.h.
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
|
|
|
|
|
|
|
| |
Catch input errors and exit early enough to print a reasonable
diagnosis based on errno.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently buffer_copy_bytes does not report to its caller whether
it encountered an early end of file.
Add a return value representing the number of bytes read (but not
the number of bytes copied). This way all three unusual conditions
can be distinguished: input error with buffer_ferror, output error
with ferror(outfile), early end of input by checking the return
value.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently there is no way to detect when input ended if it ended
early during buffer_skip_bytes. Tell the calling program how many
bytes were actually skipped for easier debugging.
Existing callers will still ignore early EOF.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Move from uint32_t to off_t as the fundamental unit of length used by
the line_buffer library. Performance would get worse if anything but
I think it's worth it for support of deltas that need to skip large
pieces (> 4 GiB).
Exception: buffer_read_string still takes a uint32_t, since it keeps
its result in an in-core obj_pool.
Callers still have to be updated to take advantage of this.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The line_buffer library silently flags input errors until
buffer_deinit time; unfortunately, by that point usually errno is
invalid. Expose the error flag so callers can check for and
report errors early for easy debugging.
some_error_prone_operation(...);
if (buffer_ferror(buf))
return error("input error: %s", strerror(errno));
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Restrict the repo_tree API to functions that are actually needed.
- decouple reading the mode and content of dirents from other
operations.
- remove repo_modify_path. It is only used to read the mode from
dirents.
- remove the ability to use repo_read_mode on a missing path. The
existing code only errors out in that case, anyway.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
svn-fe processes each commit in two stages: first decide on the
correct content for all paths and export the relevant blobs, then
export a commit with the result.
But we can keep less state and simplify svn-fe a great deal by
exporting the commit in one step: use 'inline' blobs for each path and
remember nothing. This way, the repo_tree structure could be
eliminated, and we would get support for incremental imports 'for
free'.
Reorganize handle_node along these lines. This is just a code
cleanup; the changes in repo_tree and handle_revision will come later.
[db: backported to apply without text delta support]
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The repo_tree structure remembers, for each path in each revision, a
mode (regular file, executable, symlink, or directory) and content
(blob mark or directory structure). Maintaining a second copy of all
this information when it's already in the target repository is
wasteful, it does not persist between svn-fe invocations, and most
importantly, there is no convenient way to transfer it from one
machine to another. So it would be nice to get rid of it.
As a first step, let's change the repo_tree API to match fast-import's
read commands more closely. Currently to read the mode for a path,
one uses
repo_modify_path(path, new_mode, new_content);
which changes the mode and content as a side effect. There is no
function to read the content at a path; add one.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|
|
|
|
|
|
|
|
| |
The dereference() function to peel a tree-ish and find the underlying
tree expects arithmetic to (void *) to work on byte addresses. We
should be reading the text of objects through a char * anyway.
Noticed-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
* git://github.com/gitster/git:
vcs-svn: Allow change nodes for root of tree (/)
vcs-svn: Implement Prop-delta handling
vcs-svn: Sharpen parsing of property lines
vcs-svn: Split off function for handling of individual properties
vcs-svn: Make source easier to read on small screens
vcs-svn: More dump format sanity checks
vcs-svn: Reject path nodes without Node-action
vcs-svn: Delay read of per-path properties
vcs-svn: Combine repo_replace and repo_modify functions
vcs-svn: Replace = Delete + Add
vcs-svn: handle_node: Handle deletion case early
vcs-svn: Use mark to indicate nodes with included text
vcs-svn: Unclutter handle_node by introducing have_props var
vcs-svn: Eliminate node_ctx.mark global
vcs-svn: Eliminate node_ctx.srcRev global
vcs-svn: Check for errors from open()
vcs-svn: Allow simple v3 dumps (no deltas yet)
Conflicts:
t/t9010-svn-fe.sh
vcs-svn/svndump.c
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
It is not uncommon for a svn repository to include change records for
properties at the top level of the tracked tree:
Node-path:
Node-kind: dir
Node-action: change
Prop-delta: true
Prop-content-length: 43
Content-length: 43
K 10
svn:ignore
V 11
build-area
PROPS-END
Unfortunately a recent svn-fe change (vcs-svn: More dump format sanity
checks, 2010-11-19) causes such nodes to be rejected with the error
message
fatal: invalid dump: path to be modified is missing
The repo_tree module does not keep a dirent for the root of the tree.
Add a block to the dump parser to take care of this case.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The rules for what file is used as delta source for each file are not
documented in dump-load-format.txt. Luckily, the Apache Software
Foundation repository has rich enough examples to figure out most of
the rules:
Node-action: replace implies the empty property set and empty text as
preimage for deltas. Otherwise, if a copyfrom source is given, that
node is the preimage for deltas. Lastly, if none of the above applies
and the node path exists in the current revision, then that version
forms the basis.
[jn: refactored, with tests]
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| |
| |
| |
| | |
Prepare to add a new type of property line (the 'D' line) to
handle property deltas.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The handle_property function is the part of read_props that would be
interesting for most people: semantics of properties rather than the
algorithm for parsing them.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| |
| |
| |
| | |
Remove some newlines from handle_node() that are not needed for
clarity.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Node-action: change is not appropriate when switching between file and
directory or adding a new file. Current svn-fe silently accepts such
nodes and the resulting tree has missing files in the "changed when
meant to add" case.
Node-action: add requires some content (text or directory); there is
no such thing as an "intent to add" node in svn dumps. Current svn-fe
accepts such contentless adds but produces an invalid fast-import
stream that refers to nonexistent mark :0 in response.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
It would be better to flag such errors and let the import proceed
anyway, but for now it is simpler not to worry about recovery
from such weird cases.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The mode for each file in an svn-format dump is kept in the properties
section. The properties section is read as soon as possible to allow
the correct mode to be filled in when registering the file with the
repo_tree lib.
To support nodes with a missing properties section, svn-fe determines
the mode in three stages:
- The kind (directory or file) of the node is read from the dump and
used to make an initial estimate (040000 or 100644).
- Properties are read in and allowed to override this for symlinks
and executables.
- If there is no properties section, the mode from the previous
content of the path is left alone, overriding the above
considerations.
This is a bit of a mess, and worse, it would get even more complicated
once we start to support property deltas. If we could only register
the file with a provisional value for mode and then change it later
when properties say so, the procedure would be much simpler.
... oh, right, we can.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
There are two functions to change the staged content for a path in the
svn importer's active commit: repo_replace, which changes the text and
returns the mode, and repo_modify, which changes the text and mode and
returns nothing.
Worse, there are more subtle differences:
- A mark of 0 passed to repo_modify means "use the existing content".
repo_replace uses it as mark :0 and produces a corrupt stream.
- When passed a path that is not part of the active commit,
repo_replace returns without doing anything. repo_modify
transparently adds a new directory entry.
Get rid of both and introduce a new function with the best features of
both: repo_modify_path modifies the mode, content, or both for a path,
depending on which arguments are zero. If no such dirent already
exists, it does nothing and reports the error by returning 0.
Otherwise, the return value is the resulting mode.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Simplify by reducing the "Node-action: replace" case to "Node-action:
add". This way, the main part of handle_node() only has to deal with
"add" and "change" nodes.
Functional change: replacing a symlink or executable without setting
properties will reset the mode.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Take care of "Node-action: delete" as soon as possible, so we can stop
worrying about that case in the rest of the function.
Functional change: catch deletion nodes with features that would not
apply to them (text, properties, or origin data) and error out for
those cases.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Allocate a mark if needed as soon as possible so later code can use
"if (mark)" to check if this node has text attached rather than
explicitly checking for Text-content-length.
While at it, reject directory nodes with text attached; the presence
of such a node would indicate a bug in the dump generator or svn-fe's
understanding. In the long term, it would be nice to be able to
continue parsing and save the error for later, but for now it is
simpler to error out right away.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
It is possible for a path node in an SVN-format dump file to leave out
the properties section. svn-fe handles this by carrying over the
properties (in particular, file type) from the old version of that
node.
To support this, handle_node tests several times whether a
Prop-content-length field is present. Ancient Subversion actually
leaves out the Prop-content-length field even for nodes with
properties, so that's not quite the right check. Besides, this detail
of mechanism is distracting when the question at hand is instead what
content the new node should have.
So introduce a local have_props variable. The semantics are the same
as before; the adaptations to support ancient streams that leave out
the prop-content-length can wait until someone needs them.
Signed-off-by: Jonathan Nieder <jrnieer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The mark variable is only used in handle_node(). Its life is
very short and simple: first, a new mark number is allocated if
this node has text attached, then that mark is recorded in the
in-core tree being built up, and lastly the mark is communicated
to fast-import in the stream along with the associated text.
A new reader may worry about interaction with other code, especially
since mark is not initialized to zero in handle_node() itself.
Disperse such worries by making it local. No functional change
intended.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The srcRev variable is only used in handle_node(); its purpose
is to hold the old mode for a path, to only be used if properties
are not being changed. Narrow its scope to make its meaningful
lifetime more obvious.
No functional change intended. Add some tests as a sanity-check
for the simplest case (no renames).
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| |
| |
| |
| | |
test-svn-fe segfaults when passed a bogus path. Simplify debugging by
exiting with a meaningful error message instead.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Since the dumpfile version 1 days, the Subversion dump format
gained some new fields:
- a unique identifier for the repository (version 2 format)
- whether the text and properties for a node should be
interpreted as deltas
- checksums for a delta's preimage
- SHA-1 sums as alternatives to the existing MD5 checksums for
copy source and the payload (delta).
For now what is relevant to us is the Text-delta and Prop-delta
fields, since not noticing these causes a dump file to be
misinterpreted (see the previous commit).
[jn: with tests]
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
It can sometimes be useful to write information temporarily to file,
to read back later. These functions allow a program to use the
line_buffer facilities when doing so.
It works like this:
1. find a unique filename with buffer_tmpfile_init.
2. rewind with buffer_tmpfile_rewind. This returns a stdio
handle for writing.
3. when finished writing, declare so with
buffer_tmpfile_prepare_to_read. The return value indicates
how many bytes were written.
4. read whatever portion of the file is needed.
5. if finished, remove the temporary file with buffer_deinit.
otherwise, go back to step 2,
The svn support would use this to buffer the postimage from delta
application until the length is known and fast-import can receive
the resulting blob.
Based-on-patch-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
| |
| |
| |
| |
| | |
Based-on-patch-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
buffer_read_char can be used in place of buffer_read_string(1) to
avoid consuming valuable static buffer space. The delta applier will
use this to read variable-length integers one byte at a time.
Underneath, it is fgetc, wrapped so the line_buffer library can
maintain its role as gatekeeper of input.
Later it might be worth checking if fgetc_unlocked is faster ---
most line_buffer functions are not thread-safe anyway.
Helpd-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
buffer_read_string works well for non line-oriented input except for
one problem: it does not tell the caller how many bytes were actually
written. This means that unless one is very careful about checking
for errors (and eof) the calling program cannot tell the difference
between the string "foo" followed by an early end of file and the
string "foo\0bar\0baz".
So introduce a variant that reports the length, too, a thinner wrapper
around strbuf_fread. Its result is written to a strbuf so the caller
does not need to keep track of the number of bytes read.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
POSIX makes the behavior of read(2) from a pipe fairly clear: a read
from an empty pipe will block until there is data available and any
other read will not block, prefering to return a partial result.
Likewise, fread(3) and fgets(3) are clearly specified to act as
though implemented by calling fgetc(3) in a simple loop. But the
buffering behavior of fgetc is less clear.
Luckily, no sane platform is going to implement fgetc by calling the
equivalent of read(2) more than once. fgetc has to be able to
return without filling its buffer to preserve errno when errors are
encountered anyway. So let's assume the simpler behavior (trust) but
add some tests to catch insane platforms that violate that when they
come (verify).
First check that fread can handle a 0-length read from an empty fifo.
Because open(O_RDONLY) blocks until the writing end is open, open the
writing end of the fifo in advance in a subshell.
Next try short inputs from a pipe that is not filled all the way.
Lastly (two tests) try very large inputs from a pipe that will not fit
in the relevant buffers. The first of these tests reads a little
more than 8192 bytes, which is BUFSIZ (the size of stdio's buffers)
on this Linux machine. The second reads a little over 64 KiB (the
pipe capacity on Linux) and is not run unless requested by setting
the GIT_REMOTE_SVN_TEST_BIG_FILES environment variable.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
|