<feed xmlns='http://www.w3.org/2005/Atom'>
<title>delta/python-packages/gitpython.git/git/diff.py, branch 2.0.7</title>
<subtitle>github.com: gitpython-developers/GitPython.git
</subtitle>
<link rel='alternate' type='text/html' href='http://trove.baserock.org/cgit/delta/python-packages/gitpython.git/'/>
<entry>
<title>Store raw path bytes in Diff instances</title>
<updated>2016-06-14T21:09:22+00:00</updated>
<author>
<name>Vincent Driessen</name>
<email>me@nvie.com</email>
</author>
<published>2016-06-14T20:44:11+00:00</published>
<link rel='alternate' type='text/html' href='http://trove.baserock.org/cgit/delta/python-packages/gitpython.git/commit/?id=3ee291c469fc7ea6065ed22f344ed3f2792aa2ca'/>
<id>3ee291c469fc7ea6065ed22f344ed3f2792aa2ca</id>
<content type='text'>
Previously, the following fields on Diff instances were assumed to be
passed in as unicode strings:

  - `a_path`
  - `b_path`
  - `rename_from`
  - `rename_to`

However, since Git natively records paths as bytes, these may not have
a valid unicode representation.

This patch changes the Diff instance to instead take the following
equivalent fields that should be raw bytes instead:

  - `a_rawpath`
  - `b_rawpath`
  - `raw_rename_from`
  - `raw_rename_to`

NOTE ON BACKWARD COMPATIBILITY:
The original `a_path`, `b_path`, etc. fields are still available as
properties (rather than slots).  These properties now dynamically decode
the raw bytes into a unicode string (performing the potentially
destructive operation of replacing invalid unicode chars by "�"'s).
This means that all code using Diffs should remain backward compatible.
The only exception is when people would manually construct Diff
instances by calling the constructor directly, in which case they should
now pass in bytes rather than unicode strings.

See also the discussion on
https://github.com/gitpython-developers/GitPython/pull/467
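
A minimal sketch of the backward-compatible property pattern described
above, assuming the raw bytes are decoded as UTF-8 with the "replace"
error handler.  `RawPathHolder` is a hypothetical stand-in for
illustration only, not GitPython's actual Diff class:

```python
class RawPathHolder(object):
    """Hypothetical stand-in for Diff; shows only the property pattern."""

    # Only the raw bytes are stored; the text form is derived on demand.
    __slots__ = ("a_rawpath",)

    def __init__(self, a_rawpath):
        if a_rawpath is not None and not isinstance(a_rawpath, bytes):
            raise TypeError("a_rawpath must be bytes or None")
        self.a_rawpath = a_rawpath

    @property
    def a_path(self):
        # Decode lazily; invalid byte sequences become U+FFFD, so this
        # conversion can lose information.
        if self.a_rawpath is None:
            return None
        return self.a_rawpath.decode("utf-8", "replace")


holder = RawPathHolder(b"caf\xe9.txt")  # a Latin-1 byte, invalid as UTF-8
print(repr(holder.a_rawpath))
print(repr(holder.a_path))
```

Code that needs the exact on-disk name should read the raw field; the
decoded property is only safe for display.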
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Previously, the following fields on Diff instances were assumed to be
passed in as unicode strings:

  - `a_path`
  - `b_path`
  - `rename_from`
  - `rename_to`

However, since Git natively records paths as bytes, these may not have
a valid unicode representation.

This patch changes the Diff instance to instead take the following
equivalent fields that should be raw bytes instead:

  - `a_rawpath`
  - `b_rawpath`
  - `raw_rename_from`
  - `raw_rename_to`

NOTE ON BACKWARD COMPATIBILITY:
The original `a_path`, `b_path`, etc. fields are still available as
properties (rather than slots).  These properties now dynamically decode
the raw bytes into a unicode string (performing the potentially
destructive operation of replacing invalid unicode chars by "�"'s).
This means that all code using Diffs should remain backward compatible.
The only exception is when people would manually construct Diff
instances by calling the constructor directly, in which case they should
now pass in bytes rather than unicode strings.

See also the discussion on
https://github.com/gitpython-developers/GitPython/pull/467
</pre>
</div>
</content>
</entry>
<entry>
<title>Don't choke on (legitimately) invalidly encoded Unicode paths</title>
<updated>2016-06-06T10:16:11+00:00</updated>
<author>
<name>Vincent Driessen</name>
<email>me@nvie.com</email>
</author>
<published>2016-06-06T10:13:37+00:00</published>
<link rel='alternate' type='text/html' href='http://trove.baserock.org/cgit/delta/python-packages/gitpython.git/commit/?id=200d3c6cb436097eaee7c951a0c9921bfcb75c7f'/>
<id>200d3c6cb436097eaee7c951a0c9921bfcb75c7f</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Fix bug in diff parser output</title>
<updated>2016-05-30T13:44:46+00:00</updated>
<author>
<name>Vincent Driessen</name>
<email>me@nvie.com</email>
</author>
<published>2016-05-30T13:26:23+00:00</published>
<link rel='alternate' type='text/html' href='http://trove.baserock.org/cgit/delta/python-packages/gitpython.git/commit/?id=1faf84f8eb760b003ad2be81432443bf443b82e6'/>
<id>1faf84f8eb760b003ad2be81432443bf443b82e6</id>
<content type='text'>
The diff --patch parser was missing an edge case: Git encodes non-ASCII
chars in path names as octal escapes, but these weren't decoded
properly.  For example, Git emits:

    \360\237\222\251.txt

Decoded via utf-8, that will return:

    💩.txt
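
A minimal sketch of that decoding step, assuming only three-digit octal
escapes need handling.  The helper name `decode_octal_escapes` is
hypothetical and this is not necessarily how the parser implements it:

```python
import re


def decode_octal_escapes(escaped):
    """Replace each backslash plus three octal digits with the byte it encodes."""
    def to_byte(match):
        # int() with base 8 turns b"360" into 240, i.e. the byte 0xF0.
        return bytes(bytearray([int(match.group(1), 8)]))
    return re.sub(br"\\([0-7]{3})", to_byte, escaped)


raw = decode_octal_escapes(br"\360\237\222\251.txt")
path = raw.decode("utf-8")
print(repr(raw))
```

The four escapes are the four bytes of one UTF-8 sequence, so they must
be converted to bytes first and decoded together afterwards.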
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The diff --patch parser was missing an edge case: Git encodes non-ASCII
chars in path names as octal escapes, but these weren't decoded
properly.  For example, Git emits:

    \360\237\222\251.txt

Decoded via utf-8, that will return:

    💩.txt
</pre>
</div>
</content>
</entry>
<entry>
<title>Deprecate Diffable.rename for .renamed_file</title>
<updated>2016-05-19T10:43:19+00:00</updated>
<author>
<name>Sebastian Thiel</name>
<email>byronimo@gmail.com</email>
</author>
<published>2016-05-19T10:41:16+00:00</published>
<link rel='alternate' type='text/html' href='http://trove.baserock.org/cgit/delta/python-packages/gitpython.git/commit/?id=4bcc4d55baef64825b4163c6fb8526a2744b4a86'/>
<id>4bcc4d55baef64825b4163c6fb8526a2744b4a86</id>
<content type='text'>
Fixes #426
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fixes #426
</pre>
</div>
</content>
</entry>
<entry>
<title>Python 3 compat fixes</title>
<updated>2016-04-19T22:12:55+00:00</updated>
<author>
<name>Vincent Driessen</name>
<email>me@nvie.com</email>
</author>
<published>2016-04-19T22:07:22+00:00</published>
<link rel='alternate' type='text/html' href='http://trove.baserock.org/cgit/delta/python-packages/gitpython.git/commit/?id=19099f9ce7e8d6cb1f5cafae318859be8c082ca2'/>
<id>19099f9ce7e8d6cb1f5cafae318859be8c082ca2</id>
<content type='text'>
Specifically "string_escape" does not exist as an encoding anymore.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Specifically "string_escape" does not exist as an encoding anymore.
</pre>
</div>
</content>
</entry>
<entry>
<title>Fix diff patch parser for paths with unsafe chars</title>
<updated>2016-04-19T21:46:54+00:00</updated>
<author>
<name>Vincent Driessen</name>
<email>me@nvie.com</email>
</author>
<published>2016-04-19T21:41:01+00:00</published>
<link rel='alternate' type='text/html' href='http://trove.baserock.org/cgit/delta/python-packages/gitpython.git/commit/?id=7fbc182e6d4636f67f44e5893dee3dcedfa90e04'/>
<id>7fbc182e6d4636f67f44e5893dee3dcedfa90e04</id>
<content type='text'>
This specifically covers the cases where unsafe chars occur in path
names, and git-diff -p will escape those.

From the git-diff-tree manpage:

&gt; 3. TAB, LF, double quote and backslash characters in pathnames are
&gt;    represented as \t, \n, \" and \\, respectively. If there is need
&gt;    for such substitution then the whole pathname is put in double
&gt;    quotes.

This patch checks whether or not this has happened and will unescape
those paths accordingly.

One thing to note here is that, depending on the position in the patch
format, those paths may be prefixed with an a/ or b/.  I've specifically
made sure to never interpret a path that actually starts with a/ or b/
incorrectly.

Example of that subtlety below.  Here, the actual file path is
"b/normal".  In the diff output, that gets encoded as "b/b/normal".

     diff --git a/b/normal b/b/normal
     new file mode 100644
     index 0000000000000000000000000000000000000000..eaf5f7510320b6a327fb308379de2f94d8859a54
     --- /dev/null
     +++ b/b/normal
     @@ -0,0 +1 @@
     +dummy content

Here, we prefer the "---" and "+++" lines' values.  Note that these
paths start with a/ or b/.  The only exception is the value "/dev/null",
which is handled as a special case.

Suppose now the file gets moved to "b/moved"; the output of that diff
would then be this:

     diff --git a/b/normal b/b/moved
     similarity index 100%
     rename from b/normal
     rename to b/moved

We prefer the "rename" lines' values in this case (the "diff" line is
always a last resort).  Take note that those lines are not prefixed with
a/ or b/, but the ones in the "diff" line are (just like the ones in
"---" or "+++" lines).
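
A rough sketch of the rules above for "---" and "+++" values, under the
assumption that only the four escapes quoted from the manpage occur.
The helper names `unquote` and `path_from_header` are hypothetical and
do not mirror the parser's real code:

```python
import re

_ESCAPES = {"t": "\t", "n": "\n", '"': '"', "\\": "\\"}


def unquote(value):
    """Undo Git's quoting when the whole path is wrapped in double quotes."""
    match = re.match(r'^"(.*)"$', value, re.S)
    if match is None:
        return value
    # Resolve each escape in a single pass so an escaped backslash is
    # never misread as the start of another escape.
    return re.sub(r'\\([tn"\\])', lambda m: _ESCAPES[m.group(1)],
                  match.group(1))


def path_from_header(value, prefix):
    """Return the path from a '---' or '+++' value, or None for /dev/null."""
    value = unquote(value)
    if value == "/dev/null":
        return None
    if not value.startswith(prefix):
        raise ValueError("expected prefix %r in %r" % (prefix, value))
    # Strip exactly one prefix; a path that itself starts with b/ stays intact.
    return value[len(prefix):]


print(path_from_header("b/b/normal", "b/"))
print(path_from_header("/dev/null", "a/"))
```

Stripping by prefix length rather than with a repeated strip is what
keeps "b/b/normal" from collapsing to "normal".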
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This specifically covers the cases where unsafe chars occur in path
names, and git-diff -p will escape those.

From the git-diff-tree manpage:

&gt; 3. TAB, LF, double quote and backslash characters in pathnames are
&gt;    represented as \t, \n, \" and \\, respectively. If there is need
&gt;    for such substitution then the whole pathname is put in double
&gt;    quotes.

This patch checks whether or not this has happened and will unescape
those paths accordingly.

One thing to note here is that, depending on the position in the patch
format, those paths may be prefixed with an a/ or b/.  I've specifically
made sure to never interpret a path that actually starts with a/ or b/
incorrectly.

Example of that subtlety below.  Here, the actual file path is
"b/normal".  In the diff output, that gets encoded as "b/b/normal".

     diff --git a/b/normal b/b/normal
     new file mode 100644
     index 0000000000000000000000000000000000000000..eaf5f7510320b6a327fb308379de2f94d8859a54
     --- /dev/null
     +++ b/b/normal
     @@ -0,0 +1 @@
     +dummy content

Here, we prefer the "---" and "+++" lines' values.  Note that these
paths start with a/ or b/.  The only exception is the value "/dev/null",
which is handled as a special case.

Suppose now the file gets moved to "b/moved"; the output of that diff
would then be this:

     diff --git a/b/normal b/b/moved
     similarity index 100%
     rename from b/normal
     rename to b/moved

We prefer the "rename" lines' values in this case (the "diff" line is
always a last resort).  Take note that those lines are not prefixed with
a/ or b/, but the ones in the "diff" line are (just like the ones in
"---" or "+++" lines).
</pre>
</div>
</content>
</entry>
<entry>
<title>Fix order of regex parts</title>
<updated>2016-04-19T19:46:16+00:00</updated>
<author>
<name>Vincent Driessen</name>
<email>me@nvie.com</email>
</author>
<published>2016-04-15T06:32:45+00:00</published>
<link rel='alternate' type='text/html' href='http://trove.baserock.org/cgit/delta/python-packages/gitpython.git/commit/?id=1445b59bb41c4b1a94b7cb0ec6864c98de63814b'/>
<id>1445b59bb41c4b1a94b7cb0ec6864c98de63814b</id>
<content type='text'>
When both old/new mode and rename from/to lines are found, they will
appear in different order.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When both old/new mode and rename from/to lines are found, they will
appear in different order.
</pre>
</div>
</content>
</entry>
<entry>
<title>Fix regex</title>
<updated>2016-04-19T19:46:16+00:00</updated>
<author>
<name>Vincent Driessen</name>
<email>me@nvie.com</email>
</author>
<published>2016-04-14T23:39:58+00:00</published>
<link rel='alternate' type='text/html' href='http://trove.baserock.org/cgit/delta/python-packages/gitpython.git/commit/?id=cdf7c5aca2201cf9dfc3cd301264da4ea352b737'/>
<id>cdf7c5aca2201cf9dfc3cd301264da4ea352b737</id>
<content type='text'>
This makes sure we're not matching a \n here by accident.  It's now
almost the same as the original that used \S+, except that spaces are
not eaten at the end of the string (for files that end in a space).
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This makes sure we're not matching a \n here by accident.  It's now
almost the same as the original that used \S+, except that spaces are
not eaten at the end of the string (for files that end in a space).
</pre>
</div>
</content>
</entry>
<entry>
<title>Make diff patch parsing more reliable</title>
<updated>2016-04-19T19:45:18+00:00</updated>
<author>
<name>Vincent Driessen</name>
<email>me@nvie.com</email>
</author>
<published>2016-04-14T19:27:39+00:00</published>
<link rel='alternate' type='text/html' href='http://trove.baserock.org/cgit/delta/python-packages/gitpython.git/commit/?id=e77128e5344ce7d84302facc08d17c3151037ec3'/>
<id>e77128e5344ce7d84302facc08d17c3151037ec3</id>
<content type='text'>
The a_path and b_path cannot reliably be read from the first diff line
as it's ambiguous.  From the git-diff manpage:

  &gt; The a/ and b/ filenames are the same unless rename/copy is involved.
  &gt; Especially, **even for a creation or a deletion**, /dev/null is not
  &gt; used in place of the a/ or b/ filenames.

This patch changes the a_path and b_path detection to read it from the
more reliable locations further down the diff headers.  Two use cases
are fixed by this:

  - As the man page snippet above states, for new/deleted files the a
    or b path will now be properly None.
  - File names with spaces in them are now properly parsed.

Working on this patch, I realized the --- and +++ lines really belong to
the diff header, not the diff contents.  This means that when parsing
the patch format, the --- and +++ lines will now be swallowed and will
no longer end up as part of the diff contents.  The diff contents now
always start with an @@ line.

This may be a breaking change for some users that rely on this
behaviour.  However, those users can now access that information more
reliably via the normal Diff properties a_path and b_path.
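
A small sketch of the header/contents split described above, assuming
the contents begin at the first line starting with "@@".  The helper
`split_patch` is hypothetical, not part of GitPython:

```python
def split_patch(patch_text):
    """Split one file's patch into (header, contents) at the first @@ line."""
    lines = patch_text.splitlines(True)  # keep line endings
    for index, line in enumerate(lines):
        if line.startswith("@@"):
            return "".join(lines[:index]), "".join(lines[index:])
    # No hunks at all (e.g. a pure rename): everything is header.
    return patch_text, ""


sample = (
    "diff --git a/b/normal b/b/normal\n"
    "new file mode 100644\n"
    "--- /dev/null\n"
    "+++ b/b/normal\n"
    "@@ -0,0 +1 @@\n"
    "+dummy content\n"
)
header, contents = split_patch(sample)
print(contents)
```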
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The a_path and b_path cannot reliably be read from the first diff line
as it's ambiguous.  From the git-diff manpage:

  &gt; The a/ and b/ filenames are the same unless rename/copy is involved.
  &gt; Especially, **even for a creation or a deletion**, /dev/null is not
  &gt; used in place of the a/ or b/ filenames.

This patch changes the a_path and b_path detection to read it from the
more reliable locations further down the diff headers.  Two use cases
are fixed by this:

  - As the man page snippet above states, for new/deleted files the a
    or b path will now be properly None.
  - File names with spaces in them are now properly parsed.

Working on this patch, I realized the --- and +++ lines really belong to
the diff header, not the diff contents.  This means that when parsing
the patch format, the --- and +++ lines will now be swallowed and will
no longer end up as part of the diff contents.  The diff contents now
always start with an @@ line.

This may be a breaking change for some users that rely on this
behaviour.  However, those users can now access that information more
reliably via the normal Diff properties a_path and b_path.
</pre>
</div>
</content>
</entry>
<entry>
<title>Perform diff-tree recursively to have the same output as diff</title>
<updated>2016-04-14T15:31:17+00:00</updated>
<author>
<name>Vincent Driessen</name>
<email>me@nvie.com</email>
</author>
<published>2016-04-14T15:31:17+00:00</published>
<link rel='alternate' type='text/html' href='http://trove.baserock.org/cgit/delta/python-packages/gitpython.git/commit/?id=28afef550371cd506db2045cbdd89d895bec5091'/>
<id>28afef550371cd506db2045cbdd89d895bec5091</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
</feed>
