Merge tag 'upstream/0.9.5' into unstable

Upstream version 0.9.5
author: Jelmer Vernooĳ <jelmer@jelmer.uk> 2016-04-18 17:39:04 +0000
committer: Jelmer Vernooĳ <jelmer@jelmer.uk> 2016-04-18 17:39:04 +0000
commit: 9b953dff18b4df3a5cbc9794a3c4bf1430395e74
tree: e86bd99e010c50cefe2a11bfcfe2597c5636c123
parent: 39137a0f21aca2f403680909a1a1ed79774f58e7
parent: 3859165aeb1876e47af7826a33b8a16a44e05e75
22 files changed, 1329 insertions, 788 deletions
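
Note on the fastimport/commands.py changes in this merge: each command class now serializes itself through `__bytes__`, and the base class derives `__repr__` from it. The sketch below illustrates that pattern in isolation. It is a minimal illustration, not the library's code: the `Command` and `Blob` classes are hypothetical stand-ins for `ImportCommand` and `BlobCommand`, and it assumes marks and data are passed as bytes. The `%` formatting on bytes literals requires Python 3.5 or later (PEP 461), which is consistent with the NEWS entry naming python3.5.

```python
import sys


class Command(object):
    """Hypothetical stand-in for the ImportCommand base class in the diff."""

    def __repr__(self):
        # Python 2: str is bytes, so the byte form can be returned directly.
        if sys.version_info[0] == 2:
            return self.__bytes__()
        # Python 3: repr() must return text, so decode the byte form.
        return bytes(self).decode('utf8')

    def __bytes__(self):
        raise NotImplementedError('An implementation of __bytes__ is required')


class Blob(Command):
    """Hypothetical, simplified analogue of BlobCommand."""

    def __init__(self, mark, data):
        self.mark = mark  # bytes, or None when the blob has no mark
        self.data = data  # bytes

    def __bytes__(self):
        if self.mark is None:
            mark_line = b''
        else:
            mark_line = b'\nmark :%s' % self.mark
        # %-formatting on bytes works on Python 2 and on Python 3.5+ only.
        return b'blob%s\ndata %d\n%s' % (mark_line, len(self.data), self.data)


blob = Blob(b'1', b'hello')
serialized = bytes(blob)  # byte form, suitable for writing to a stream
text = repr(blob)         # text form on Python 3, derived from the bytes
```

Keeping the serialization in one bytes-returning method means the stream written to disk and the debugging representation cannot drift apart; only the decoding step differs between the two Python major versions.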
diff --git a/AUTHORS b/AUTHORS
new file mode 100644
index 0000000..b0e0f23
--- /dev/null
+++ b/AUTHORS
@@ -0,0 +1,6 @@
+Ian Clatworthy wrote bzr-fastimport, which included a lot of generic code for
+parsing and generating fastimport streams. Jelmer Vernooij split it out
+into its own separate package (python-fastimport) so it can be used by other
+projects, and is its current maintainer.
+
+Félix Mattrat ported python-fastimport to Python 3.
diff --git a/COPYING b/COPYING
new file mode 100644
index 0000000..d511905
--- /dev/null
+++ b/COPYING
@@ -0,0 +1,339 @@
+		    GNU GENERAL PUBLIC LICENSE
+		       Version 2, June 1991
+
+ Copyright (C) 1989, 1991 Free Software Foundation, Inc.,
+ 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ Everyone is permitted to copy and distribute verbatim copies
+ of this license document, but changing it is not allowed.
+
+			    Preamble
+
+  The licenses for most software are designed to take away your
+freedom to share and change it.  By contrast, the GNU General Public
+License is intended to guarantee your freedom to share and change free
+software--to make sure the software is free for all its users.  This
+General Public License applies to most of the Free Software
+Foundation's software and to any other program whose authors commit to
+using it.  (Some other Free Software Foundation software is covered by
+the GNU Lesser General Public License instead.)  You can apply it to
+your programs, too.
+
+  When we speak of free software, we are referring to freedom, not
+price.  Our General Public Licenses are designed to make sure that you
+have the freedom to distribute copies of free software (and charge for
+this service if you wish), that you receive source code or can get it
+if you want it, that you can change the software or use pieces of it
+in new free programs; and that you know you can do these things.
+
+  To protect your rights, we need to make restrictions that forbid
+anyone to deny you these rights or to ask you to surrender the rights.
+These restrictions translate to certain responsibilities for you if you
+distribute copies of the software, or if you modify it.
+
+  For example, if you distribute copies of such a program, whether
+gratis or for a fee, you must give the recipients all the rights that
+you have.  You must make sure that they, too, receive or can get the
+source code.  And you must show them these terms so they know their
+rights.
+
+  We protect your rights with two steps: (1) copyright the software, and
+(2) offer you this license which gives you legal permission to copy,
+distribute and/or modify the software.
+
+  Also, for each author's protection and ours, we want to make certain
+that everyone understands that there is no warranty for this free
+software.  If the software is modified by someone else and passed on, we
+want its recipients to know that what they have is not the original, so
+that any problems introduced by others will not reflect on the original
+authors' reputations.
+
+  Finally, any free program is threatened constantly by software
+patents.  We wish to avoid the danger that redistributors of a free
+program will individually obtain patent licenses, in effect making the
+program proprietary.  To prevent this, we have made it clear that any
+patent must be licensed for everyone's free use or not licensed at all.
+
+  The precise terms and conditions for copying, distribution and
+modification follow.
+
+		    GNU GENERAL PUBLIC LICENSE
+   TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
+
+  0. This License applies to any program or other work which contains
+a notice placed by the copyright holder saying it may be distributed
+under the terms of this General Public License.  The "Program", below,
+refers to any such program or work, and a "work based on the Program"
+means either the Program or any derivative work under copyright law:
+that is to say, a work containing the Program or a portion of it,
+either verbatim or with modifications and/or translated into another
+language.  (Hereinafter, translation is included without limitation in
+the term "modification".)  Each licensee is addressed as "you".
+
+Activities other than copying, distribution and modification are not
+covered by this License; they are outside its scope.  The act of
+running the Program is not restricted, and the output from the Program
+is covered only if its contents constitute a work based on the
+Program (independent of having been made by running the Program).
+Whether that is true depends on what the Program does.
+
+  1. You may copy and distribute verbatim copies of the Program's
+source code as you receive it, in any medium, provided that you
+conspicuously and appropriately publish on each copy an appropriate
+copyright notice and disclaimer of warranty; keep intact all the
+notices that refer to this License and to the absence of any warranty;
+and give any other recipients of the Program a copy of this License
+along with the Program.
+
+You may charge a fee for the physical act of transferring a copy, and
+you may at your option offer warranty protection in exchange for a fee.
+
+  2. You may modify your copy or copies of the Program or any portion
+of it, thus forming a work based on the Program, and copy and
+distribute such modifications or work under the terms of Section 1
+above, provided that you also meet all of these conditions:
+
+    a) You must cause the modified files to carry prominent notices
+    stating that you changed the files and the date of any change.
+
+    b) You must cause any work that you distribute or publish, that in
+    whole or in part contains or is derived from the Program or any
+    part thereof, to be licensed as a whole at no charge to all third
+    parties under the terms of this License.
+
+    c) If the modified program normally reads commands interactively
+    when run, you must cause it, when started running for such
+    interactive use in the most ordinary way, to print or display an
+    announcement including an appropriate copyright notice and a
+    notice that there is no warranty (or else, saying that you provide
+    a warranty) and that users may redistribute the program under
+    these conditions, and telling the user how to view a copy of this
+    License.  (Exception: if the Program itself is interactive but
+    does not normally print such an announcement, your work based on
+    the Program is not required to print an announcement.)
+
+These requirements apply to the modified work as a whole.  If
+identifiable sections of that work are not derived from the Program,
+and can be reasonably considered independent and separate works in
+themselves, then this License, and its terms, do not apply to those
+sections when you distribute them as separate works.  But when you
+distribute the same sections as part of a whole which is a work based
+on the Program, the distribution of the whole must be on the terms of
+this License, whose permissions for other licensees extend to the
+entire whole, and thus to each and every part regardless of who wrote it.
+
+Thus, it is not the intent of this section to claim rights or contest
+your rights to work written entirely by you; rather, the intent is to
+exercise the right to control the distribution of derivative or
+collective works based on the Program.
+
+In addition, mere aggregation of another work not based on the Program
+with the Program (or with a work based on the Program) on a volume of
+a storage or distribution medium does not bring the other work under
+the scope of this License.
+
+  3. You may copy and distribute the Program (or a work based on it,
+under Section 2) in object code or executable form under the terms of
+Sections 1 and 2 above provided that you also do one of the following:
+
+    a) Accompany it with the complete corresponding machine-readable
+    source code, which must be distributed under the terms of Sections
+    1 and 2 above on a medium customarily used for software interchange; or,
+
+    b) Accompany it with a written offer, valid for at least three
+    years, to give any third party, for a charge no more than your
+    cost of physically performing source distribution, a complete
+    machine-readable copy of the corresponding source code, to be
+    distributed under the terms of Sections 1 and 2 above on a medium
+    customarily used for software interchange; or,
+
+    c) Accompany it with the information you received as to the offer
+    to distribute corresponding source code.  (This alternative is
+    allowed only for noncommercial distribution and only if you
+    received the program in object code or executable form with such
+    an offer, in accord with Subsection b above.)
+
+The source code for a work means the preferred form of the work for
+making modifications to it.  For an executable work, complete source
+code means all the source code for all modules it contains, plus any
+associated interface definition files, plus the scripts used to
+control compilation and installation of the executable.  However, as a
+special exception, the source code distributed need not include
+anything that is normally distributed (in either source or binary
+form) with the major components (compiler, kernel, and so on) of the
+operating system on which the executable runs, unless that component
+itself accompanies the executable.
+
+If distribution of executable or object code is made by offering
+access to copy from a designated place, then offering equivalent
+access to copy the source code from the same place counts as
+distribution of the source code, even though third parties are not
+compelled to copy the source along with the object code.
+
+  4. You may not copy, modify, sublicense, or distribute the Program
+except as expressly provided under this License.  Any attempt
+otherwise to copy, modify, sublicense or distribute the Program is
+void, and will automatically terminate your rights under this License.
+However, parties who have received copies, or rights, from you under
+this License will not have their licenses terminated so long as such
+parties remain in full compliance.
+
+  5. You are not required to accept this License, since you have not
+signed it.  However, nothing else grants you permission to modify or
+distribute the Program or its derivative works.  These actions are
+prohibited by law if you do not accept this License.  Therefore, by
+modifying or distributing the Program (or any work based on the
+Program), you indicate your acceptance of this License to do so, and
+all its terms and conditions for copying, distributing or modifying
+the Program or works based on it.
+
+  6. Each time you redistribute the Program (or any work based on the
+Program), the recipient automatically receives a license from the
+original licensor to copy, distribute or modify the Program subject to
+these terms and conditions.  You may not impose any further
+restrictions on the recipients' exercise of the rights granted herein.
+You are not responsible for enforcing compliance by third parties to
+this License.
+
+  7. If, as a consequence of a court judgment or allegation of patent
+infringement or for any other reason (not limited to patent issues),
+conditions are imposed on you (whether by court order, agreement or
+otherwise) that contradict the conditions of this License, they do not
+excuse you from the conditions of this License.  If you cannot
+distribute so as to satisfy simultaneously your obligations under this
+License and any other pertinent obligations, then as a consequence you
+may not distribute the Program at all.  For example, if a patent
+license would not permit royalty-free redistribution of the Program by
+all those who receive copies directly or indirectly through you, then
+the only way you could satisfy both it and this License would be to
+refrain entirely from distribution of the Program.
+
+If any portion of this section is held invalid or unenforceable under
+any particular circumstance, the balance of the section is intended to
+apply and the section as a whole is intended to apply in other
+circumstances.
+
+It is not the purpose of this section to induce you to infringe any
+patents or other property right claims or to contest validity of any
+such claims; this section has the sole purpose of protecting the
+integrity of the free software distribution system, which is
+implemented by public license practices.  Many people have made
+generous contributions to the wide range of software distributed
+through that system in reliance on consistent application of that
+system; it is up to the author/donor to decide if he or she is willing
+to distribute software through any other system and a licensee cannot
+impose that choice.
+
+This section is intended to make thoroughly clear what is believed to
+be a consequence of the rest of this License.
+
+  8. If the distribution and/or use of the Program is restricted in
+certain countries either by patents or by copyrighted interfaces, the
+original copyright holder who places the Program under this License
+may add an explicit geographical distribution limitation excluding
+those countries, so that distribution is permitted only in or among
+countries not thus excluded.  In such case, this License incorporates
+the limitation as if written in the body of this License.
+
+  9. The Free Software Foundation may publish revised and/or new versions
+of the General Public License from time to time.  Such new versions will
+be similar in spirit to the present version, but may differ in detail to
+address new problems or concerns.
+
+Each version is given a distinguishing version number.  If the Program
+specifies a version number of this License which applies to it and "any
+later version", you have the option of following the terms and conditions
+either of that version or of any later version published by the Free
+Software Foundation.  If the Program does not specify a version number of
+this License, you may choose any version ever published by the Free Software
+Foundation.
+
+  10. If you wish to incorporate parts of the Program into other free
+programs whose distribution conditions are different, write to the author
+to ask for permission.  For software which is copyrighted by the Free
+Software Foundation, write to the Free Software Foundation; we sometimes
+make exceptions for this.  Our decision will be guided by the two goals
+of preserving the free status of all derivatives of our free software and
+of promoting the sharing and reuse of software generally.
+
+			    NO WARRANTY
+
+  11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
+FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW.  EXCEPT WHEN
+OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
+PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
+OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.  THE ENTIRE RISK AS
+TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU.  SHOULD THE
+PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
+REPAIR OR CORRECTION.
+
+  12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
+WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
+REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
+INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
+OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
+TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
+YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
+PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGES.
+
+		     END OF TERMS AND CONDITIONS
+
+	    How to Apply These Terms to Your New Programs
+
+  If you develop a new program, and you want it to be of the greatest
+possible use to the public, the best way to achieve this is to make it
+free software which everyone can redistribute and change under these terms.
+
+  To do so, attach the following notices to the program.  It is safest
+to attach them to the start of each source file to most effectively
+convey the exclusion of warranty; and each file should have at least
+the "copyright" line and a pointer to where the full notice is found.
+
+    <one line to give the program's name and a brief idea of what it does.>
+    Copyright (C) <year>  <name of author>
+
+    This program is free software; you can redistribute it and/or modify
+    it under the terms of the GNU General Public License as published by
+    the Free Software Foundation; either version 2 of the License, or
+    (at your option) any later version.
+
+    This program is distributed in the hope that it will be useful,
+    but WITHOUT ANY WARRANTY; without even the implied warranty of
+    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+    GNU General Public License for more details.
+
+    You should have received a copy of the GNU General Public License along
+    with this program; if not, write to the Free Software Foundation, Inc.,
+    51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+
+Also add information on how to contact you by electronic and paper mail.
+
+If the program is interactive, make it output a short notice like this
+when it starts in an interactive mode:
+
+    Gnomovision version 69, Copyright (C) year name of author
+    Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
+    This is free software, and you are welcome to redistribute it
+    under certain conditions; type `show c' for details.
+
+The hypothetical commands `show w' and `show c' should show the appropriate
+parts of the General Public License.  Of course, the commands you use may
+be called something other than `show w' and `show c'; they could even be
+mouse-clicks or menu items--whatever suits your program.
+
+You should also get your employer (if you work as a programmer) or your
+school, if any, to sign a "copyright disclaimer" for the program, if
+necessary.  Here is a sample; alter the names:
+
+  Yoyodyne, Inc., hereby disclaims all copyright interest in the program
+  `Gnomovision' (which makes passes at compilers) written by James Hacker.
+
+  <signature of Ty Coon>, 1 April 1989
+  Ty Coon, President of Vice
+
+This General Public License does not permit incorporating your program into
+proprietary programs.  If your program is a subroutine library, you may
+consider it more useful to permit linking proprietary applications with the
+library.  If this is what you want to do, use the GNU Lesser General
+Public License instead of this License.
diff --git a/NEWS b/NEWS
new file mode 100644
index 0000000..96a0d02
--- /dev/null
+++ b/NEWS
@@ -0,0 +1,46 @@
+0.9.5	2016-04-18
+
+ * Add python3.5 support. (Félix Mattrat)
+
+0.9.4	2014-07-04
+
+ * Get handlers from class object using getattr() for possible inheritance
+   (Cécile Tonglet)
+
+ * Fix 'check-pypy' by removing use of nonexistent target. (masklinn)
+
+ * Use namedtuple for authorship tuple in Commit.{author,committer}.
+   (masklinn)
+
+0.9.3	2014-03-01
+
+ * Remove unused and untested helper single_plural,
+   invert_dict, invert_dictset, defines_to_dict and
+   binary_stream.
+   (Jelmer Vernooij)
+
+ * Install NEWS and README files.
+
+0.9.2	2012-04-03
+
+ * Remove reftracker and idmapfile, which are bzr-specific.
+   (Jelmer Vernooij, #693507)
+
+ * Cope with invalid timezones like +61800 a little bit better.
+   (Jelmer Vernooij, #959154)
+
+ * Allow non-strict parsing of fastimport streams, when
+   a tagger is missing an email address.
+   (Jelmer Vernooij, #730607)
+
+0.9.1	2012-02-28
+
+ * Update FSF address in headers. (Dan Callaghan, #868800)
+
+ * Support 'done' feature. (Jelmer Vernooij, #942563)
+
+ * Rename tarball for the benefit of pip. (Jelmer Vernooij, #779690)
+
+0.9.0	2011-01-30
+
+ Initial release.
diff --git a/PKG-INFO b/PKG-INFO
index 460a10b..02e349f 100644
--- a/PKG-INFO
+++ b/PKG-INFO
@@ -1,6 +1,6 @@
 Metadata-Version: 1.0
 Name: fastimport
-Version: 0.9.4
+Version: 0.9.5
 Summary: VCS fastimport/fastexport parser
 Home-page: https://launchpad.net/python-fastimport
 Author: Canonical Ltd
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..975b92b
--- /dev/null
+++ b/README.md
@@ -0,0 +1,5 @@
+python-fastimport
+=================
+
+This package provides a parser for and generator of the Git fastimport format.
+(https://www.kernel.org/pub/software/scm/git/docs/git-fast-import.html)
diff --git a/fastimport/__init__.py b/fastimport/__init__.py
index 37265b8..e515366 100644
--- a/fastimport/__init__.py
+++ b/fastimport/__init__.py
@@ -30,4 +30,4 @@ it can be used by other projects.  Use it like so:
    processor.process(parser.parse())
 """
 
-__version__ = (0, 9, 4)
+__version__ = (0, 9, 5)
diff --git a/fastimport/commands.py b/fastimport/commands.py
index d83b905..c8fb25e 100644
--- a/fastimport/commands.py
+++ b/fastimport/commands.py
@@ -18,8 +18,17 @@
 These objects are used by the parser to represent the content of
 a fast-import stream.
 """
+from __future__ import division
 
+import re
 import stat
+import sys
+
+from fastimport.helpers import (
+    newobject as object,
+    utf8_bytes_string,
+    )
+
 
 # There is a bug in git 1.5.4.3 and older by which unquoting a string consumes
 # one extra character. Set this variable to True to work-around it. It only
@@ -32,15 +41,15 @@ GIT_FAST_IMPORT_NEEDS_EXTRA_SPACE_AFTER_QUOTE = False
 
 
 # Lists of command names
-COMMAND_NAMES = ['blob', 'checkpoint', 'commit', 'feature', 'progress',
-    'reset', 'tag']
-FILE_COMMAND_NAMES = ['filemodify', 'filedelete', 'filecopy', 'filerename',
-    'filedeleteall']
+COMMAND_NAMES = [b'blob', b'checkpoint', b'commit', b'feature', b'progress',
+    b'reset', b'tag']
+FILE_COMMAND_NAMES = [b'filemodify', b'filedelete', b'filecopy', b'filerename',
+    b'filedeleteall']
 
 # Feature names
-MULTIPLE_AUTHORS_FEATURE = "multiple-authors"
-COMMIT_PROPERTIES_FEATURE = "commit-properties"
-EMPTY_DIRS_FEATURE = "empty-directories"
+MULTIPLE_AUTHORS_FEATURE = b'multiple-authors'
+COMMIT_PROPERTIES_FEATURE = b'commit-properties'
+EMPTY_DIRS_FEATURE = b'empty-directories'
 FEATURE_NAMES = [
     MULTIPLE_AUTHORS_FEATURE,
     COMMIT_PROPERTIES_FEATURE,
@@ -59,6 +68,17 @@ class ImportCommand(object):
     def __str__(self):
         return repr(self)
 
+    def __repr__(self):
+        if sys.version_info[0] == 2:
+            return self.__bytes__()
+        else:
+            return bytes(self).decode('utf8')
+
+    def __bytes__(self):
+        raise NotImplementedError(
+            'An implementation of __bytes__ is required'
+        )
+
     def dump_str(self, names=None, child_lists=None, verbose=False):
         """Dump fields as a string.
 
@@ -74,13 +94,16 @@ class ImportCommand(object):
         """
         interesting = {}
         if names is None:
-            fields = [k for k in self.__dict__.keys() if not k.startswith('_')]
+            fields = [
+                k for k in list(self.__dict__.keys())
+                if not k.startswith('_')
+            ]
         else:
             fields = names
         for field in fields:
             value = self.__dict__.get(field)
             if field in self._binary and value is not None:
-                value = '(...)'
+                value = b'(...)'
             interesting[field] = value
         if verbose:
             return "%s: %s" % (self.__class__.__name__, interesting)
@@ -91,39 +114,39 @@ class ImportCommand(object):
 class BlobCommand(ImportCommand):
 
     def __init__(self, mark, data, lineno=0):
-        ImportCommand.__init__(self, 'blob')
+        ImportCommand.__init__(self, b'blob')
         self.mark = mark
         self.data = data
         self.lineno = lineno
         # Provide a unique id in case the mark is missing
         if mark is None:
-            self.id = '@%d' % lineno
+            self.id = b'@%d' % lineno
         else:
-            self.id = ':' + mark
-        self._binary = ['data']
+            self.id = b':' + mark
+        self._binary = [b'data']
 
-    def __repr__(self):
+    def __bytes__(self):
         if self.mark is None:
-            mark_line = ""
+            mark_line = b''
         else:
-            mark_line = "\nmark :%s" % self.mark
-        return "blob%s\ndata %d\n%s" % (mark_line, len(self.data), self.data)
+            mark_line = b"\nmark :%s" % self.mark
+        return b'blob%s\ndata %d\n%s' % (mark_line, len(self.data), self.data)
 
 
 class CheckpointCommand(ImportCommand):
 
     def __init__(self):
-        ImportCommand.__init__(self, 'checkpoint')
+        ImportCommand.__init__(self, b'checkpoint')
 
-    def __repr__(self):
-        return "checkpoint"
+    def __bytes__(self):
+        return b'checkpoint'
 
 
 class CommitCommand(ImportCommand):
 
     def __init__(self, ref, mark, author, committer, message, from_,
         merges, file_iter, lineno=0, more_authors=None, properties=None):
-        ImportCommand.__init__(self, 'commit')
+        ImportCommand.__init__(self, b'commit')
         self.ref = ref
         self.mark = mark
         self.author = author
@@ -135,74 +158,83 @@ class CommitCommand(ImportCommand):
         self.more_authors = more_authors
         self.properties = properties
         self.lineno = lineno
-        self._binary = ['file_iter']
+        self._binary = [b'file_iter']
         # Provide a unique id in case the mark is missing
         if mark is None:
-            self.id = '@%d' % lineno
+            self.id = b'@%d' % lineno
         else:
-            self.id = ':%s' % mark
+            self.id = b':%s' % mark
 
     def copy(self, **kwargs):
         if not isinstance(self.file_iter, list):
             self.file_iter = list(self.file_iter)
 
-        fields = dict((k, v) for k, v in self.__dict__.iteritems()
-                      if k not in ('id', 'name')
-                      if not k.startswith('_'))
+        fields = dict(
+            (key, value)
+            for key, value in self.__dict__.items()
+            if key not in ('id', 'name')
+            if not key.startswith('_')
+        )
+
         fields.update(kwargs)
+
         return CommitCommand(**fields)
 
-    def __repr__(self):
+    def __bytes__(self):
         return self.to_string(include_file_contents=True)
 
-    def __str__(self):
-        return self.to_string(include_file_contents=False)
 
     def to_string(self, use_features=True, include_file_contents=False):
+        """
+            @todo the name to_string is ambiguous since the method actually
+                returns bytes.
+        """
         if self.mark is None:
-            mark_line = ""
+            mark_line = b''
         else:
-            mark_line = "\nmark :%s" % self.mark
+            mark_line = b'\nmark :%s' % self.mark
         if self.author is None:
-            author_section = ""
+            author_section = b''
         else:
-            author_section = "\nauthor %s" % format_who_when(self.author)
+            author_section = b'\nauthor %s' % format_who_when(self.author)
             if use_features and self.more_authors:
                 for author in self.more_authors:
-                    author_section += "\nauthor %s" % format_who_when(author)
-        committer = "committer %s" % format_who_when(self.committer)
+                    author_section += b'\nauthor %s' % format_who_when(author)
+
+        committer = b'committer %s' % format_who_when(self.committer)
+
         if self.message is None:
-            msg_section = ""
+            msg_section = b''
         else:
             msg = self.message
-            msg_section = "\ndata %d\n%s" % (len(msg), msg)
+            msg_section = b'\ndata %d\n%s' % (len(msg), msg)
         if self.from_ is None:
-            from_line = ""
+            from_line = b''
         else:
-            from_line = "\nfrom %s" % self.from_
+            from_line = b'\nfrom %s' % self.from_
         if self.merges is None:
-            merge_lines = ""
+            merge_lines = b''
         else:
-            merge_lines = "".join(["\nmerge %s" % (m,)
+            merge_lines = b''.join([b'\nmerge %s' % (m,)
                 for m in self.merges])
         if use_features and self.properties:
             property_lines = []
             for name in sorted(self.properties):
                 value = self.properties[name]
-                property_lines.append("\n" + format_property(name, value))
-            properties_section = "".join(property_lines)
+                property_lines.append(b'\n' + format_property(name, value))
+            properties_section = b''.join(property_lines)
         else:
-            properties_section = ""
+            properties_section = b''
         if self.file_iter is None:
-            filecommands = ""
+            filecommands = b''
         else:
             if include_file_contents:
-                format_str = "\n%r"
+                format_str = b'\n%r'
             else:
-                format_str = "\n%s"
-            filecommands = "".join([format_str % (c,)
+                format_str = b'\n%s'
+            filecommands = b''.join([format_str % (c,)
                 for c in self.iter_files()])
-        return "commit %s%s%s\n%s%s%s%s%s%s" % (self.ref, mark_line,
+        return b'commit %s%s%s\n%s%s%s%s%s%s' % (self.ref, mark_line,
             author_section, committer, msg_section, from_line, merge_lines,
             properties_section, filecommands)
 
@@ -215,7 +247,7 @@ class CommitCommand(ImportCommand):
                 child_names = child_lists[f.name]
             except KeyError:
                 continue
-            result.append("\t%s" % f.dump_str(child_names, verbose=verbose))
+            result.append('\t%s' % f.dump_str(child_names, verbose=verbose))
         return '\n'.join(result)
 
     def iter_files(self):
@@ -229,73 +261,73 @@ class CommitCommand(ImportCommand):
 class FeatureCommand(ImportCommand):
 
     def __init__(self, feature_name, value=None, lineno=0):
-        ImportCommand.__init__(self, 'feature')
+        ImportCommand.__init__(self, b'feature')
         self.feature_name = feature_name
         self.value = value
         self.lineno = lineno
 
-    def __repr__(self):
+    def __bytes__(self):
         if self.value is None:
-            value_text = ""
+            value_text = b''
         else:
-            value_text = "=%s" % self.value
-        return "feature %s%s" % (self.feature_name, value_text)
+            value_text = b'=%s' % self.value
+        return b'feature %s%s' % (self.feature_name, value_text)
 
 
 class ProgressCommand(ImportCommand):
 
     def __init__(self, message):
-        ImportCommand.__init__(self, 'progress')
+        ImportCommand.__init__(self, b'progress')
         self.message = message
 
-    def __repr__(self):
-        return "progress %s" % (self.message,)
+    def __bytes__(self):
+        return b'progress %s' % (self.message,)
 
 
 class ResetCommand(ImportCommand):
 
     def __init__(self, ref, from_):
-        ImportCommand.__init__(self, 'reset')
+        ImportCommand.__init__(self, b'reset')
         self.ref = ref
         self.from_ = from_
 
-    def __repr__(self):
+    def __bytes__(self):
         if self.from_ is None:
-            from_line = ""
+            from_line = b''
         else:
             # According to git-fast-import(1), the extra LF is optional here;
             # however, versions of git up to 1.5.4.3 had a bug by which the LF
             # was needed. Always emit it, since it doesn't hurt and maintains
             # compatibility with older versions.
             # http://git.kernel.org/?p=git/git.git;a=commit;h=655e8515f279c01f525745d443f509f97cd805ab
-            from_line = "\nfrom %s\n" % self.from_
-        return "reset %s%s" % (self.ref, from_line)
+            from_line = b'\nfrom %s\n' % self.from_
+        return b'reset %s%s' % (self.ref, from_line)
 
 
 class TagCommand(ImportCommand):
 
     def __init__(self, id, from_, tagger, message):
-        ImportCommand.__init__(self, 'tag')
+        ImportCommand.__init__(self, b'tag')
         self.id = id
         self.from_ = from_
         self.tagger = tagger
         self.message = message
 
-    def __repr__(self):
+    def __bytes__(self):
         if self.from_ is None:
-            from_line = ""
+            from_line = b''
         else:
-            from_line = "\nfrom %s" % self.from_
+            from_line = b'\nfrom %s' % self.from_
         if self.tagger is None:
-            tagger_line = ""
+            tagger_line = b''
         else:
-            tagger_line = "\ntagger %s" % format_who_when(self.tagger)
+            tagger_line = b'\ntagger %s' % format_who_when(self.tagger)
         if self.message is None:
-            msg_section = ""
+            msg_section = b''
         else:
             msg = self.message
-            msg_section = "\ndata %d\n%s" % (len(msg), msg)
-        return "tag %s%s%s%s" % (self.id, from_line, tagger_line, msg_section)
+            msg_section = b'\ndata %d\n%s' % (len(msg), msg)
+        return b'tag %s%s%s%s' % (self.id, from_line, tagger_line, msg_section)
 
 
 class FileCommand(ImportCommand):
@@ -307,66 +339,67 @@ class FileModifyCommand(FileCommand):
 
     def __init__(self, path, mode, dataref, data):
         # Either dataref or data should be null
-        FileCommand.__init__(self, 'filemodify')
+        FileCommand.__init__(self, b'filemodify')
         self.path = check_path(path)
         self.mode = mode
         self.dataref = dataref
         self.data = data
-        self._binary = ['data']
+        self._binary = [b'data']
 
-    def __repr__(self):
+    def __bytes__(self):
         return self.to_string(include_file_contents=True)
 
     def __str__(self):
         return self.to_string(include_file_contents=False)
 
     def _format_mode(self, mode):
-        if mode in (0755, 0100755):
-            return "755"
-        elif mode in (0644, 0100644):
-            return "644"
-        elif mode == 040000:
-            return "040000"
-        elif mode == 0120000:
-            return "120000"
-        elif mode == 0160000:
-            return "160000"
+        if mode in (0o755, 0o100755):
+            return b'755'
+        elif mode in (0o644, 0o100644):
+            return b'644'
+        elif mode == 0o40000:
+            return b'040000'
+        elif mode == 0o120000:
+            return b'120000'
+        elif mode == 0o160000:
+            return b'160000'
         else:
-            raise AssertionError("Unknown mode %o" % mode)
+            raise AssertionError('Unknown mode %o' % mode)
 
     def to_string(self, include_file_contents=False):
-        datastr = ""
+        datastr = b''
         if stat.S_ISDIR(self.mode):
-            dataref = '-'
+            dataref = b'-'
         elif self.dataref is None:
-            dataref = "inline"
+            dataref = b'inline'
             if include_file_contents:
-                datastr = "\ndata %d\n%s" % (len(self.data), self.data)
+                datastr = b'\ndata %d\n%s' % (len(self.data), self.data)
         else:
-            dataref = "%s" % (self.dataref,)
+            dataref = b'%s' % (self.dataref,)
         path = format_path(self.path)
-        return "M %s %s %s%s" % (self._format_mode(self.mode), dataref, path, datastr)
+
+        return b'M %s %s %s%s' % (self._format_mode(self.mode), dataref, path, datastr)
 
 
 class FileDeleteCommand(FileCommand):
 
     def __init__(self, path):
-        FileCommand.__init__(self, 'filedelete')
+        FileCommand.__init__(self, b'filedelete')
         self.path = check_path(path)
 
-    def __repr__(self):
-        return "D %s" % (format_path(self.path),)
+    def __bytes__(self):
+        return b'D %s' % (format_path(self.path),)
 
 
 class FileCopyCommand(FileCommand):
 
     def __init__(self, src_path, dest_path):
-        FileCommand.__init__(self, 'filecopy')
+        FileCommand.__init__(self, b'filecopy')
         self.src_path = check_path(src_path)
         self.dest_path = check_path(dest_path)
 
-    def __repr__(self):
-        return "C %s %s" % (
+    def __bytes__(self):
+        return b'C %s %s' % (
             format_path(self.src_path, quote_spaces=True),
             format_path(self.dest_path))
 
@@ -374,37 +407,38 @@ class FileCopyCommand(FileCommand):
 class FileRenameCommand(FileCommand):
 
     def __init__(self, old_path, new_path):
-        FileCommand.__init__(self, 'filerename')
+        FileCommand.__init__(self, b'filerename')
         self.old_path = check_path(old_path)
         self.new_path = check_path(new_path)
 
-    def __repr__(self):
-        return "R %s %s" % (
+    def __bytes__(self):
+        return b'R %s %s' % (
             format_path(self.old_path, quote_spaces=True),
-            format_path(self.new_path))
+            format_path(self.new_path)
+        )
 
 
 class FileDeleteAllCommand(FileCommand):
 
     def __init__(self):
-        FileCommand.__init__(self, 'filedeleteall')
+        FileCommand.__init__(self, b'filedeleteall')
+
+    def __bytes__(self):
+        return b'deleteall'
 
-    def __repr__(self):
-        return "deleteall"
 
 class NoteModifyCommand(FileCommand):
 
     def __init__(self, from_, data):
-        super(NoteModifyCommand, self).__init__('notemodify')
+        super(NoteModifyCommand, self).__init__(b'notemodify')
         self.from_ = from_
         self.data = data
         self._binary = ['data']
 
-    def __str__(self):
-        return "N inline :%s" % self.from_
-
-    def __repr__(self):
-        return "%s\ndata %d\n%s" % (self, len(self.data), self.data)
+    def __bytes__(self):
+        return b'N inline :%s\ndata %d\n%s' % (
+            self.from_, len(self.data), self.data
+        )
 
 
 def check_path(path):
@@ -413,24 +447,28 @@ def check_path(path):
     :return: the path if all is OK
     :raise ValueError: if the path is illegal
     """
-    if path is None or path == '' or path[0] == "/":
+    if path is None or path == b'' or path.startswith(b'/'):
         raise ValueError("illegal path '%s'" % path)
-    if type(path) != str:
+
+    if (
+        (sys.version_info[0] >= 3 and not isinstance(path, bytes)) or
+        (sys.version_info[0] == 2 and not isinstance(path, str))
+    ):
         raise TypeError("illegale type for path '%r'" % path)
+
     return path
 
 
 def format_path(p, quote_spaces=False):
     """Format a path in utf8, quoting it if necessary."""
-    if '\n' in p:
-        import re
-        p = re.sub('\n', '\\n', p)
+    if b'\n' in p:
+        p = re.sub(b'\n', b'\\n', p)
         quote = True
     else:
-        quote = p[0] == '"' or (quote_spaces and ' ' in p)
+        quote = p[0:1] == b'"' or (quote_spaces and b' ' in p)
     if quote:
-        extra = GIT_FAST_IMPORT_NEEDS_EXTRA_SPACE_AFTER_QUOTE and ' ' or ''
-        p = '"%s"%s' % (p, extra)
+        extra = GIT_FAST_IMPORT_NEEDS_EXTRA_SPACE_AFTER_QUOTE and b' ' or b''
+        p = b'"%s"%s' % (p, extra)
     return p
 
 
@@ -438,33 +476,42 @@ def format_who_when(fields):
     """Format a tuple of name,email,secs-since-epoch,utc-offset-secs as a string."""
     offset = fields[3]
     if offset < 0:
-        offset_sign = '-'
+        offset_sign = b'-'
         offset = abs(offset)
     else:
-        offset_sign = '+'
-    offset_hours = offset / 3600
-    offset_minutes = offset / 60 - offset_hours * 60
-    offset_str = "%s%02d%02d" % (offset_sign, offset_hours, offset_minutes)
+        offset_sign = b'+'
+    offset_hours = offset // 3600
+    offset_minutes = offset // 60 - offset_hours * 60
+    offset_str = b'%s%02d%02d' % (offset_sign, offset_hours, offset_minutes)
     name = fields[0]
-    if name == '':
-        sep = ''
+
+    if name == b'':
+        sep = b''
     else:
-        sep = ' '
-    if isinstance(name, unicode):
-        name = name.encode('utf8')
+        sep = b' '
+
+    name = utf8_bytes_string(name)
+
     email = fields[1]
-    if isinstance(email, unicode):
-        email = email.encode('utf8')
-    result = "%s%s<%s> %d %s" % (name, sep, email, fields[2], offset_str)
+
+    email = utf8_bytes_string(email)
+
+    result = b'%s%s<%s> %d %s' % (name, sep, email, fields[2], offset_str)
+
     return result
 
 
 def format_property(name, value):
     """Format the name and value (both unicode) of a property as a string."""
-    utf8_name = name.encode('utf8')
+    result = b''
+    utf8_name = utf8_bytes_string(name)
+
     if value is not None:
-        utf8_value = value.encode('utf8')
-        result = "property %s %d %s" % (utf8_name, len(utf8_value), utf8_value)
+        utf8_value = utf8_bytes_string(value)
+        result = b'property %s %d %s' % (
+            utf8_name, len(utf8_value), utf8_value
+        )
     else:
-        result = "property %s" % (utf8_name,)
+        result = b'property %s' % (utf8_name,)
+
     return result
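Review note: format_who_when switches from / to // because / is true division on Python 3 and would produce floats. The sketch below isolates that offset arithmetic. The name format_offset is hypothetical and not part of fastimport, and the bytes %-formatting assumes Python 3.5 or later.

```python
def format_offset(offset_secs):
    """Format a UTC offset in seconds as a bytes string such as b'+0100'.

    Hypothetical helper mirroring the arithmetic in format_who_when.
    """
    if offset_secs < 0:
        sign = b'-'
        offset_secs = abs(offset_secs)
    else:
        sign = b'+'
    # Floor division keeps both values as integers on Python 2 and 3.
    hours = offset_secs // 3600
    minutes = offset_secs // 60 - hours * 60
    return b'%s%02d%02d' % (sign, hours, minutes)


print(format_offset(3600))    # b'+0100'
print(format_offset(-19800))  # b'-0530'
```

With plain / on Python 3, hours would be 1.0 rather than 1, and the minutes computation would no longer truncate partial hours.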
diff --git a/fastimport/dates.py b/fastimport/dates.py
index 96efcf2..3deb8e1 100644
--- a/fastimport/dates.py
+++ b/fastimport/dates.py
@@ -22,8 +22,6 @@ timestamp,timezone where
 * timestamp is seconds since epoch
 * timezone is the offset from UTC in seconds.
 """
-
-
 import time
 
 from fastimport import errors
@@ -31,11 +29,11 @@ from fastimport import errors
 
 def parse_raw(s, lineno=0):
     """Parse a date from a raw string.
-    
+
     The format must be exactly "seconds-since-epoch offset-utc".
     See the spec for details.
     """
-    timestamp_str, timezone_str = s.split(' ', 1)
+    timestamp_str, timezone_str = s.split(b' ', 1)
     timestamp = float(timestamp_str)
     try:
         timezone = parse_tz(timezone_str)
@@ -50,17 +48,22 @@ def parse_tz(tz):
     :return: the timezone offset in seconds.
     """
     # from git_repository.py in bzr-git
-    if tz[0] not in ('+', '-'):
+    sign_byte = tz[0:1]
+    # In Python 3, b'+006'[0] would return an integer,
+    # but b'+006'[0:1] returns a new bytes string.
+    if sign_byte not in (b'+', b'-'):
         raise ValueError(tz)
-    sign = {'+': +1, '-': -1}[tz[0]]
+
+    sign = {b'+': +1, b'-': -1}[sign_byte]
     hours = int(tz[1:-2])
     minutes = int(tz[-2:])
+
     return sign * 60 * (60 * hours + minutes)
 
 
 def parse_rfc2822(s, lineno=0):
     """Parse a date from a rfc2822 string.
-    
+
     See the spec for details.
     """
     raise NotImplementedError(parse_rfc2822)
@@ -77,7 +80,7 @@ def parse_now(s, lineno=0):
 
 # Lookup tabel of date parsing routines
 DATE_PARSERS_BY_NAME = {
-    'raw':      parse_raw,
-    'rfc2822':  parse_rfc2822,
-    'now':      parse_now,
+    u'raw':      parse_raw,
+    u'rfc2822':  parse_rfc2822,
+    u'now':      parse_now,
     }
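Review note: parse_tz now slices with tz[0:1] because indexing a bytes object on Python 3 yields an integer, not a one-byte string. The sketch below is a standalone copy of that logic under the hypothetical name demo_parse_tz, assuming the input is a bytes string such as b'+0100'.

```python
def demo_parse_tz(tz):
    """Parse a bytes timezone such as b'+0100' into an offset in seconds.

    Hypothetical standalone copy of the parse_tz logic in this diff.
    """
    # Slicing returns bytes on both Python 2 and 3; tz[0] would be an
    # int on Python 3 and never match b'+' or b'-'.
    sign_byte = tz[0:1]
    if sign_byte not in (b'+', b'-'):
        raise ValueError(tz)
    sign = {b'+': +1, b'-': -1}[sign_byte]
    hours = int(tz[1:-2])
    minutes = int(tz[-2:])
    return sign * 60 * (60 * hours + minutes)


print(demo_parse_tz(b'+0100'))  # 3600
print(demo_parse_tz(b'-0530'))  # -19800
```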
diff --git a/fastimport/errors.py b/fastimport/errors.py
index afb78bc..7555628 100644
--- a/fastimport/errors.py
+++ b/fastimport/errors.py
@@ -20,67 +20,11 @@ _LOCATION_FMT = "line %(lineno)d: "
 
 # ImportError is heavily based on BzrError
 
-class ImportError(StandardError):
+class ImportError(Exception):
     """The base exception class for all import processing exceptions."""
 
-    _fmt = "Unknown Import Error"
-
-    def __init__(self, msg=None, **kwds):
-        StandardError.__init__(self)
-        if msg is not None:
-            self._preformatted_string = msg
-        else:
-            self._preformatted_string = None
-            for key, value in kwds.items():
-                setattr(self, key, value)
-
-    def _format(self):
-        s = getattr(self, '_preformatted_string', None)
-        if s is not None:
-            # contains a preformatted message
-            return s
-        try:
-            fmt = self._fmt
-            if fmt:
-                d = dict(self.__dict__)
-                s = fmt % d
-                # __str__() should always return a 'str' object
-                # never a 'unicode' object.
-                return s
-        except (AttributeError, TypeError, NameError, ValueError, KeyError), e:
-            return 'Unprintable exception %s: dict=%r, fmt=%r, error=%r' \
-                % (self.__class__.__name__,
-                   self.__dict__,
-                   getattr(self, '_fmt', None),
-                   e)
-
-    def __unicode__(self):
-        u = self._format()
-        if isinstance(u, str):
-            # Try decoding the str using the default encoding.
-            u = unicode(u)
-        elif not isinstance(u, unicode):
-            # Try to make a unicode object from it, because __unicode__ must
-            # return a unicode object.
-            u = unicode(u)
-        return u
-
-    def __str__(self):
-        s = self._format()
-        if isinstance(s, unicode):
-            s = s.encode('utf8')
-        else:
-            # __str__ must return a str.
-            s = str(s)
-        return s
-
-    def __repr__(self):
-        return '%s(%s)' % (self.__class__.__name__, str(self))
-
-    def __eq__(self, other):
-        if self.__class__ is not other.__class__:
-            return NotImplemented
-        return self.__dict__ == other.__dict__
+    def __init__(self):
+        super(ImportError, self).__init__(self._fmt % self.__dict__)
 
 
 class ParsingError(ImportError):
@@ -89,8 +33,8 @@ class ParsingError(ImportError):
     _fmt = _LOCATION_FMT + "Unknown Import Parsing Error"
 
     def __init__(self, lineno):
-        ImportError.__init__(self)
         self.lineno = lineno
+        ImportError.__init__(self)
 
 
 class MissingBytes(ParsingError):
@@ -100,9 +44,9 @@ class MissingBytes(ParsingError):
         " found %(found)d")
 
     def __init__(self, lineno, expected, found):
-        ParsingError.__init__(self, lineno)
         self.expected = expected
         self.found = found
+        ParsingError.__init__(self, lineno)
 
 
 class MissingTerminator(ParsingError):
@@ -112,8 +56,8 @@ class MissingTerminator(ParsingError):
         "Unexpected EOF - expected '%(terminator)s' terminator")
 
     def __init__(self, lineno, terminator):
-        ParsingError.__init__(self, lineno)
         self.terminator = terminator
+        ParsingError.__init__(self, lineno)
 
 
 class InvalidCommand(ParsingError):
@@ -122,8 +66,8 @@ class InvalidCommand(ParsingError):
     _fmt = (_LOCATION_FMT + "Invalid command '%(cmd)s'")
 
     def __init__(self, lineno, cmd):
-        ParsingError.__init__(self, lineno)
         self.cmd = cmd
+        ParsingError.__init__(self, lineno)
 
 
 class MissingSection(ParsingError):
@@ -132,9 +76,9 @@ class MissingSection(ParsingError):
     _fmt = (_LOCATION_FMT + "Command %(cmd)s is missing section %(section)s")
 
     def __init__(self, lineno, cmd, section):
-        ParsingError.__init__(self, lineno)
         self.cmd = cmd
         self.section = section
+        ParsingError.__init__(self, lineno)
 
 
 class BadFormat(ParsingError):
@@ -144,10 +88,10 @@ class BadFormat(ParsingError):
         "command %(cmd)s: found '%(text)s'")
 
     def __init__(self, lineno, cmd, section, text):
-        ParsingError.__init__(self, lineno)
         self.cmd = cmd
         self.section = section
         self.text = text
+        ParsingError.__init__(self, lineno)
 
 
 class InvalidTimezone(ParsingError):
@@ -157,12 +101,12 @@ class InvalidTimezone(ParsingError):
         "Timezone %(timezone)r could not be converted.%(reason)s")
 
     def __init__(self, lineno, timezone, reason=None):
-        ParsingError.__init__(self, lineno)
         self.timezone = timezone
         if reason:
             self.reason = ' ' + reason
         else:
             self.reason = ''
+        ParsingError.__init__(self, lineno)
 
 
 class PrematureEndOfStream(ParsingError):
@@ -180,8 +124,8 @@ class UnknownDateFormat(ImportError):
     _fmt = ("Unknown date format '%(format)s'")
 
     def __init__(self, format):
-        ImportError.__init__(self)
         self.format = format
+        ImportError.__init__(self)
 
 
 class MissingHandler(ImportError):
@@ -190,8 +134,8 @@ class MissingHandler(ImportError):
     _fmt = ("Missing handler for command %(cmd)s")
 
     def __init__(self, cmd):
-        ImportError.__init__(self)
         self.cmd = cmd
+        ImportError.__init__(self)
 
 
 class UnknownParameter(ImportError):
@@ -200,9 +144,9 @@ class UnknownParameter(ImportError):
     _fmt = ("Unknown parameter - '%(param)s' not in %(knowns)s")
 
     def __init__(self, param, knowns):
-        ImportError.__init__(self)
         self.param = param
         self.knowns = knowns
+        ImportError.__init__(self)
 
 
 class BadRepositorySize(ImportError):
@@ -212,9 +156,9 @@ class BadRepositorySize(ImportError):
         "%(expected)d expected")
 
     def __init__(self, expected, found):
-        ImportError.__init__(self)
         self.expected = expected
         self.found = found
+        ImportError.__init__(self)
 
 
 class BadRestart(ImportError):
@@ -224,8 +168,8 @@ class BadRestart(ImportError):
         "but matching revision-id is unknown")
 
     def __init__(self, commit_id):
-        ImportError.__init__(self)
         self.commit_id = commit_id
+        ImportError.__init__(self)
 
 
 class UnknownFeature(ImportError):
@@ -235,5 +179,5 @@ class UnknownFeature(ImportError):
         "an earlier data format")
 
     def __init__(self, feature):
-        ImportError.__init__(self)
         self.feature = feature
+        ImportError.__init__(self)
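Review note: every subclass in errors.py now assigns its attributes before calling the base __init__, because the new base constructor formats self._fmt against self.__dict__ immediately. The sketch below shows why the order matters. DemoError and DemoParsingError are hypothetical stand-ins, not the classes in fastimport.errors.

```python
class DemoError(Exception):
    """Hypothetical base class: formats _fmt from instance attributes."""

    _fmt = "Unknown error"

    def __init__(self):
        # The message is built here, so any attribute named in _fmt
        # must already be set on the instance.
        super(DemoError, self).__init__(self._fmt % self.__dict__)


class DemoParsingError(DemoError):

    _fmt = "line %(lineno)d: Unknown Import Parsing Error"

    def __init__(self, lineno):
        self.lineno = lineno      # set first ...
        DemoError.__init__(self)  # ... then format the message


print(str(DemoParsingError(12)))  # line 12: Unknown Import Parsing Error
```

Calling the base constructor first, as the old code did, would raise KeyError here because lineno is not yet in self.__dict__.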
diff --git a/fastimport/helpers.py b/fastimport/helpers.py
index a236b7d..c27c436 100644
--- a/fastimport/helpers.py
+++ b/fastimport/helpers.py
@@ -14,6 +14,7 @@
 # along with this program.  If not, see <http://www.gnu.org/licenses/>.
 
 """Miscellaneous useful stuff."""
+import sys
 
 
 def _common_path_and_rest(l1, l2, common=[]):
@@ -21,12 +22,19 @@ def _common_path_and_rest(l1, l2, common=[]):
     if len(l1) < 1: return (common, l1, l2)
     if len(l2) < 1: return (common, l1, l2)
     if l1[0] != l2[0]: return (common, l1, l2)
-    return _common_path_and_rest(l1[1:], l2[1:], common+[l1[0]])
+    return _common_path_and_rest(
+        l1[1:],
+        l2[1:],
+        common + [
+            l1[0:1] # returns a byte string in Python 3, unlike l1[0], which
+                    # would return an integer.
+        ]
+    )
 
 
 def common_path(path1, path2):
     """Find the common bit of 2 paths."""
-    return ''.join(_common_path_and_rest(path1, path2)[0])
+    return b''.join(_common_path_and_rest(path1, path2)[0])
 
 
 def common_directory(paths):
@@ -38,14 +46,14 @@ def common_directory(paths):
     """
     import posixpath
     def get_dir_with_slash(path):
-        if path == '' or path.endswith('/'):
+        if path == b'' or path.endswith(b'/'):
             return path
         else:
             dirname, basename = posixpath.split(path)
-            if dirname == '':
+            if dirname == b'':
                 return dirname
             else:
-                return dirname + '/'
+                return dirname + b'/'
 
     if not paths:
         return None
@@ -58,8 +66,8 @@ def common_directory(paths):
         return get_dir_with_slash(common)
 
 
-def is_inside(dir, fname):
-    """True if fname is inside dir.
+def is_inside(directory, fname):
+    """True if fname is inside directory.
 
     The parameters should typically be passed to osutils.normpath first, so
     that . and .. and repeated slashes are eliminated, and the separators
@@ -70,16 +78,16 @@ def is_inside(dir, fname):
     """
     # XXX: Most callers of this can actually do something smarter by
     # looking at the inventory
-    if dir == fname:
+    if directory == fname:
         return True
 
-    if dir == '':
+    if directory == b'':
         return True
 
-    if dir[-1] != '/':
-        dir += '/'
+    if not directory.endswith(b'/'):
+        directory += b'/'
 
-    return fname.startswith(dir)
+    return fname.startswith(directory)
 
 
 def is_inside_any(dir_list, fname):
@@ -88,3 +96,98 @@ def is_inside_any(dir_list, fname):
         if is_inside(dirname, fname):
             return True
     return False
+
+
+def utf8_bytes_string(s):
+    """Convert a string to a bytes string encoded in utf8"""
+    if sys.version_info[0] == 2:
+        return s.encode('utf8')
+    else:
+        if isinstance(s, str):
+            return bytes(s, encoding='utf8')
+        else:
+            return s
+
+
+def repr_bytes(obj):
+    """Return a bytes representation of the object"""
+    if sys.version_info[0] == 2:
+        return repr(obj)
+    else:
+        return bytes(obj)
+
+
+class newobject(object):
+    """
+    A magical object class that provides Python 2 compatibility methods::
+        next
+        __unicode__
+        __nonzero__
+
+    Subclasses of this class can merely define the Python 3 methods (__next__,
+    __str__, and __bool__).
+
+    This is a copy/paste of the future.types.newobject class of the future
+    package.
+    """
+    def next(self):
+        if hasattr(self, '__next__'):
+            return type(self).__next__(self)
+        raise TypeError('newobject is not an iterator')
+
+    def __unicode__(self):
+        # All subclasses of the builtin object should have __str__ defined.
+        # Note that old-style classes do not have __str__ defined.
+        if hasattr(self, '__str__'):
+            s = type(self).__str__(self)
+        else:
+            s = str(self)
+        if isinstance(s, unicode):
+            return s
+        else:
+            return s.decode('utf-8')
+
+    def __nonzero__(self):
+        if hasattr(self, '__bool__'):
+            return type(self).__bool__(self)
+        # object has no __nonzero__ method
+        return True
+
+    # Are these ever needed?
+    # def __div__(self):
+    #     return self.__truediv__()
+
+    # def __idiv__(self, other):
+    #     return self.__itruediv__(other)
+
+    def __long__(self):
+        if not hasattr(self, '__int__'):
+            return NotImplemented
+        return self.__int__()  # not type(self).__int__(self)
+
+    # def __new__(cls, *args, **kwargs):
+    #     """
+    #     dict() -> new empty dictionary
+    #     dict(mapping) -> new dictionary initialized from a mapping object's
+    #         (key, value) pairs
+    #     dict(iterable) -> new dictionary initialized as if via:
+    #         d = {}
+    #         for k, v in iterable:
+    #             d[k] = v
+    #     dict(**kwargs) -> new dictionary initialized with the name=value pairs
+    #         in the keyword argument list.  For example:  dict(one=1, two=2)
+    #     """
+
+    #     if len(args) == 0:
+    #         return super(newdict, cls).__new__(cls)
+    #     elif type(args[0]) == newdict:
+    #         return args[0]
+    #     else:
+    #         value = args[0]
+    #     return super(newdict, cls).__new__(cls, value)
+
+    def __native__(self):
+        """
+        Hook for the future.utils.native() function
+        """
+        return object(self)
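Review note: is_inside replaces dir[-1] != '/' with endswith(b'/') for the same reason as the slicing changes elsewhere, since dir[-1] is an integer on Python 3. The sketch below is a standalone copy under the hypothetical name demo_is_inside, assuming both arguments are normalized bytes paths.

```python
def demo_is_inside(directory, fname):
    """Return True if fname is directory itself or lies beneath it.

    Hypothetical standalone copy of is_inside from this diff.
    """
    if directory == fname:
        return True
    if directory == b'':
        return True
    # endswith() compares bytes with bytes on Python 2 and 3 alike.
    if not directory.endswith(b'/'):
        directory += b'/'
    return fname.startswith(directory)


print(demo_is_inside(b'src', b'src/foo.c'))   # True
print(demo_is_inside(b'src', b'srcx/foo.c'))  # False
```

Appending the slash before the prefix test is what keeps b'srcx/foo.c' from being treated as inside b'src'.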
diff --git a/fastimport/parser.py b/fastimport/parser.py
index 5f81eef..7fd0a08 100644
--- a/fastimport/parser.py
+++ b/fastimport/parser.py
@@ -157,29 +157,34 @@ The grammar is:
   comment ::= '#' not_lf* lf;
   not_lf  ::= # Any byte that is not ASCII newline (LF);
 """
-
+from __future__ import print_function
 
 import collections
 import re
 import sys
+import codecs
 
 from fastimport import (
     commands,
     dates,
     errors,
     )
+from fastimport.helpers import (
+    newobject as object,
+    utf8_bytes_string,
+    )
 
 
 ## Stream parsing ##
 
 class LineBasedParser(object):
 
-    def __init__(self, input):
+    def __init__(self, input_stream):
         """A Parser that keeps track of line numbers.
 
         :param input: the file-like object to read from
         """
-        self.input = input
+        self.input = input_stream
         self.lineno = 0
         # Lines pushed back onto the input stream
         self._buffer = []
@@ -210,7 +215,7 @@ class LineBasedParser(object):
         :param line: the line with no trailing newline
         """
         self.lineno -= 1
-        self._buffer.append(line + "\n")
+        self._buffer.append(line + b'\n')
 
     def read_bytes(self, count):
         """Read a given number of bytes from the input stream.
@@ -223,7 +228,7 @@ class LineBasedParser(object):
         """
         result = self.input.read(count)
         found = len(result)
-        self.lineno += result.count("\n")
+        self.lineno += result.count(b'\n')
         if found != count:
             self.abort(errors.MissingBytes, count, found)
         return result
@@ -239,38 +244,38 @@ class LineBasedParser(object):
         """
 
         lines = []
-        term = terminator + '\n'
+        term = terminator + b'\n'
         while True:
             line = self.input.readline()
             if line == term:
                 break
             else:
                 lines.append(line)
-        return ''.join(lines)
+        return b''.join(lines)
 
 
 # Regular expression used for parsing. (Note: The spec states that the name
 # part should be non-empty but git-fast-export doesn't always do that so
 # the first bit is \w*, not \w+.) Also git-fast-import code says the
 # space before the email is optional.
-_WHO_AND_WHEN_RE = re.compile(r'([^<]*)<(.*)> (.+)')
-_WHO_RE = re.compile(r'([^<]*)<(.*)>')
+_WHO_AND_WHEN_RE = re.compile(br'([^<]*)<(.*)> (.+)')
+_WHO_RE = re.compile(br'([^<]*)<(.*)>')
 
 
 class ImportParser(LineBasedParser):
 
-    def __init__(self, input, verbose=False, output=sys.stdout,
+    def __init__(self, input_stream, verbose=False, output=sys.stdout,
         user_mapper=None, strict=True):
         """A Parser of import commands.
 
-        :param input: the file-like object to read from
+        :param input_stream: the file-like object to read from
         :param verbose: display extra information of not
         :param output: the file-like object to write messages to (YAGNI?)
         :param user_mapper: if not None, the UserMapper used to adjust
           user-ids for authors, committers and taggers.
         :param strict: Raise errors on strictly invalid data
         """
-        LineBasedParser.__init__(self, input)
+        LineBasedParser.__init__(self, input_stream)
         self.verbose = verbose
         self.output = output
         self.user_mapper = user_mapper
@@ -287,28 +292,28 @@ class ImportParser(LineBasedParser):
         while True:
             line = self.next_line()
             if line is None:
-                if 'done' in self.features:
+                if b'done' in self.features:
                     raise errors.PrematureEndOfStream(self.lineno)
                 break
-            elif len(line) == 0 or line.startswith('#'):
+            elif len(line) == 0 or line.startswith(b'#'):
                 continue
             # Search for commands in order of likelihood
-            elif line.startswith('commit '):
-                yield self._parse_commit(line[len('commit '):])
-            elif line.startswith('blob'):
+            elif line.startswith(b'commit '):
+                yield self._parse_commit(line[len(b'commit '):])
+            elif line.startswith(b'blob'):
                 yield self._parse_blob()
-            elif line.startswith('done'):
+            elif line.startswith(b'done'):
                 break
-            elif line.startswith('progress '):
-                yield commands.ProgressCommand(line[len('progress '):])
-            elif line.startswith('reset '):
-                yield self._parse_reset(line[len('reset '):])
-            elif line.startswith('tag '):
-                yield self._parse_tag(line[len('tag '):])
-            elif line.startswith('checkpoint'):
+            elif line.startswith(b'progress '):
+                yield commands.ProgressCommand(line[len(b'progress '):])
+            elif line.startswith(b'reset '):
+                yield self._parse_reset(line[len(b'reset '):])
+            elif line.startswith(b'tag '):
+                yield self._parse_tag(line[len(b'tag '):])
+            elif line.startswith(b'checkpoint'):
                 yield commands.CheckpointCommand()
-            elif line.startswith('feature'):
-                yield self._parse_feature(line[len('feature '):])
+            elif line.startswith(b'feature'):
+                yield self._parse_feature(line[len(b'feature '):])
             else:
                 self.abort(errors.InvalidCommand, line)
 
@@ -322,21 +327,21 @@ class ImportParser(LineBasedParser):
             line = self.next_line()
             if line is None:
                 break
-            elif len(line) == 0 or line.startswith('#'):
+            elif len(line) == 0 or line.startswith(b'#'):
                 continue
             # Search for file commands in order of likelihood
-            elif line.startswith('M '):
+            elif line.startswith(b'M '):
                 yield self._parse_file_modify(line[2:])
-            elif line.startswith('D '):
+            elif line.startswith(b'D '):
                 path = self._path(line[2:])
                 yield commands.FileDeleteCommand(path)
-            elif line.startswith('R '):
+            elif line.startswith(b'R '):
                 old, new = self._path_pair(line[2:])
                 yield commands.FileRenameCommand(old, new)
-            elif line.startswith('C '):
+            elif line.startswith(b'C '):
                 src, dest = self._path_pair(line[2:])
                 yield commands.FileCopyCommand(src, dest)
-            elif line.startswith('deleteall'):
+            elif line.startswith(b'deleteall'):
                 yield commands.FileDeleteAllCommand()
             else:
                 self.push_line(line)
@@ -346,23 +351,23 @@ class ImportParser(LineBasedParser):
         """Parse a blob command."""
         lineno = self.lineno
         mark = self._get_mark_if_any()
-        data = self._get_data('blob')
+        data = self._get_data(b'blob')
         return commands.BlobCommand(mark, data, lineno)
 
     def _parse_commit(self, ref):
         """Parse a commit command."""
         lineno  = self.lineno
         mark = self._get_mark_if_any()
-        author = self._get_user_info('commit', 'author', False)
+        author = self._get_user_info(b'commit', b'author', False)
         more_authors = []
         while True:
-            another_author = self._get_user_info('commit', 'author', False)
+            another_author = self._get_user_info(b'commit', b'author', False)
             if another_author is not None:
                 more_authors.append(another_author)
             else:
                 break
-        committer = self._get_user_info('commit', 'committer')
-        message = self._get_data('commit', 'message')
+        committer = self._get_user_info(b'commit', b'committer')
+        message = self._get_data(b'commit', b'message')
         from_ = self._get_from()
         merges = []
         while True:
@@ -371,7 +376,7 @@ class ImportParser(LineBasedParser):
                 # while the spec suggests it's illegal, git-fast-export
                 # outputs multiple merges on the one line, e.g.
                 # merge :x :y :z
-                these_merges = merge.split(" ")
+                these_merges = merge.split(b' ')
                 merges.extend(these_merges)
             else:
                 break
@@ -389,7 +394,7 @@ class ImportParser(LineBasedParser):
 
     def _parse_feature(self, info):
         """Parse a feature command."""
-        parts = info.split("=", 1)
+        parts = info.split(b'=', 1)
         name = parts[0]
         if len(parts) > 1:
             value = self._path(parts[1])
@@ -404,12 +409,12 @@ class ImportParser(LineBasedParser):
         :param info: a string in the format "mode dataref path"
           (where dataref might be the hard-coded literal 'inline').
         """
-        params = info.split(' ', 2)
+        params = info.split(b' ', 2)
         path = self._path(params[2])
         mode = self._mode(params[0])
-        if params[1] == 'inline':
+        if params[1] == b'inline':
             dataref = None
-            data = self._get_data('filemodify')
+            data = self._get_data(b'filemodify')
         else:
             dataref = params[1]
             data = None
@@ -423,17 +428,17 @@ class ImportParser(LineBasedParser):
 
     def _parse_tag(self, name):
         """Parse a tag command."""
-        from_ = self._get_from('tag')
-        tagger = self._get_user_info('tag', 'tagger',
+        from_ = self._get_from(b'tag')
+        tagger = self._get_user_info(b'tag', b'tagger',
                 accept_just_who=True)
-        message = self._get_data('tag', 'message')
+        message = self._get_data(b'tag', b'message')
         return commands.TagCommand(name, from_, tagger, message)
 
     def _get_mark_if_any(self):
         """Parse a mark section."""
         line = self.next_line()
-        if line.startswith('mark :'):
-            return line[len('mark :'):]
+        if line.startswith(b'mark :'):
+            return line[len(b'mark :'):]
         else:
             self.push_line(line)
             return None
@@ -443,8 +448,8 @@ class ImportParser(LineBasedParser):
         line = self.next_line()
         if line is None:
             return None
-        elif line.startswith('from '):
-            return line[len('from '):]
+        elif line.startswith(b'from '):
+            return line[len(b'from '):]
         elif required_for:
             self.abort(errors.MissingSection, required_for, 'from')
         else:
@@ -456,8 +461,8 @@ class ImportParser(LineBasedParser):
         line = self.next_line()
         if line is None:
             return None
-        elif line.startswith('merge '):
-            return line[len('merge '):]
+        elif line.startswith(b'merge '):
+            return line[len(b'merge '):]
         else:
             self.push_line(line)
             return None
@@ -467,8 +472,8 @@ class ImportParser(LineBasedParser):
         line = self.next_line()
         if line is None:
             return None
-        elif line.startswith('property '):
-            return self._name_value(line[len('property '):])
+        elif line.startswith(b'property '):
+            return self._name_value(line[len(b'property '):])
         else:
             self.push_line(line)
             return None
@@ -477,8 +482,8 @@ class ImportParser(LineBasedParser):
         accept_just_who=False):
         """Parse a user section."""
         line = self.next_line()
-        if line.startswith(section + ' '):
-            return self._who_when(line[len(section + ' '):], cmd, section,
+        if line.startswith(section + b' '):
+            return self._who_when(line[len(section + b' '):], cmd, section,
                 accept_just_who=accept_just_who)
         elif required:
             self.abort(errors.MissingSection, cmd, section)
@@ -486,21 +491,21 @@ class ImportParser(LineBasedParser):
             self.push_line(line)
             return None
 
-    def _get_data(self, required_for, section='data'):
+    def _get_data(self, required_for, section=b'data'):
         """Parse a data section."""
         line = self.next_line()
-        if line.startswith('data '):
-            rest = line[len('data '):]
-            if rest.startswith('<<'):
+        if line.startswith(b'data '):
+            rest = line[len(b'data '):]
+            if rest.startswith(b'<<'):
                 return self.read_until(rest[2:])
             else:
                 size = int(rest)
                 read_bytes = self.read_bytes(size)
                 # optional LF after data.
-                next = self.input.readline()
+                next_line = self.input.readline()
                 self.lineno += 1
-                if len(next) > 1 or next != "\n":
-                    self.push_line(next[:-1])
+                if len(next_line) > 1 or next_line != b'\n':
+                    self.push_line(next_line[:-1])
                 return read_bytes
         else:
             self.abort(errors.MissingSection, required_for, section)
@@ -516,19 +521,19 @@ class ImportParser(LineBasedParser):
             datestr = match.group(3).lstrip()
             if self.date_parser is None:
                 # auto-detect the date format
-                if len(datestr.split(' ')) == 2:
-                    format = 'raw'
-                elif datestr == 'now':
-                    format = 'now'
+                if len(datestr.split(b' ')) == 2:
+                    date_format = 'raw'
+                elif datestr == b'now':
+                    date_format = 'now'
                 else:
-                    format = 'rfc2822'
-                self.date_parser = dates.DATE_PARSERS_BY_NAME[format]
+                    date_format = 'rfc2822'
+                self.date_parser = dates.DATE_PARSERS_BY_NAME[date_format]
             try:
                 when = self.date_parser(datestr, self.lineno)
             except ValueError:
-                print "failed to parse datestr '%s'" % (datestr,)
+                print("failed to parse datestr '%s'" % (datestr,))
                 raise
-            name = match.group(1)
+            name = match.group(1).rstrip()
             email = match.group(2)
         else:
             match = _WHO_RE.search(s)
@@ -545,18 +550,19 @@ class ImportParser(LineBasedParser):
                 email = None
                 when = dates.DATE_PARSERS_BY_NAME['now']('now')
         if len(name) > 0:
-            if name[-1] == " ":
+            if name.endswith(b' '):
                 name = name[:-1]
         # While it shouldn't happen, some datasets have email addresses
         # which contain unicode characters. See bug 338186. We sanitize
         # the data at this level just in case.
         if self.user_mapper:
             name, email = self.user_mapper.map_name_and_email(name, email)
+
         return Authorship(name, email, when[0], when[1])
 
     def _name_value(self, s):
         """Parse a (name,value) tuple from 'name value-length value'."""
-        parts = s.split(' ', 2)
+        parts = s.split(b' ', 2)
         name = parts[0]
         if len(parts) == 1:
             value = None
@@ -566,14 +572,13 @@ class ImportParser(LineBasedParser):
             still_to_read = size - len(value)
             if still_to_read > 0:
                 read_bytes = self.read_bytes(still_to_read)
-                value += "\n" + read_bytes[:still_to_read - 1]
-            value = value.decode('utf8')
+                value += b'\n' + read_bytes[:still_to_read - 1]
         return (name, value)
 
     def _path(self, s):
         """Parse a path."""
-        if s.startswith('"'):
-            if s[-1] != '"':
+        if s.startswith(b'"'):
+            if not s.endswith(b'"'):
                 self.abort(errors.BadFormat, '?', '?', s)
             else:
                 return _unquote_c_string(s[1:-1])
@@ -582,17 +587,17 @@ class ImportParser(LineBasedParser):
     def _path_pair(self, s):
         """Parse two paths separated by a space."""
         # TODO: handle a space in the first path
-        if s.startswith('"'):
-            parts = s[1:].split('" ', 1)
+        if s.startswith(b'"'):
+            parts = s[1:].split(b'" ', 1)
         else:
-            parts = s.split(' ', 1)
+            parts = s.split(b' ', 1)
         if len(parts) != 2:
             self.abort(errors.BadFormat, '?', '?', s)
-        elif parts[1].startswith('"') and parts[1].endswith('"'):
+        elif parts[1].startswith(b'"') and parts[1].endswith(b'"'):
             parts[1] = parts[1][1:-1]
-        elif parts[1].startswith('"') or parts[1].endswith('"'):
+        elif parts[1].startswith(b'"') or parts[1].endswith(b'"'):
             self.abort(errors.BadFormat, '?', '?', s)
-        return map(_unquote_c_string, parts)
+        return [_unquote_c_string(s) for s in parts]
 
     def _mode(self, s):
         """Check file mode format and parse into an int.
@@ -600,23 +605,55 @@ class ImportParser(LineBasedParser):
         :return: mode as integer
         """
         # Note: Output from git-fast-export slightly different to spec
-        if s in ['644', '100644', '0100644']:
-            return 0100644
-        elif s in ['755', '100755', '0100755']:
-            return 0100755
-        elif s in ['040000', '0040000']:
-            return 040000
-        elif s in ['120000', '0120000']:
-            return 0120000
-        elif s in ['160000', '0160000']:
-            return 0160000
+        if s in [b'644', b'100644', b'0100644']:
+            return 0o100644
+        elif s in [b'755', b'100755', b'0100755']:
+            return 0o100755
+        elif s in [b'040000', b'0040000']:
+            return 0o40000
+        elif s in [b'120000', b'0120000']:
+            return 0o120000
+        elif s in [b'160000', b'0160000']:
+            return 0o160000
         else:
             self.abort(errors.BadFormat, 'filemodify', 'mode', s)
 
 
+ESCAPE_SEQUENCE_BYTES_RE = re.compile(br'''
+    ( \\U........      # 8-digit hex escapes
+    | \\u....          # 4-digit hex escapes
+    | \\x..            # 2-digit hex escapes
+    | \\[0-7]{1,3}     # Octal escapes
+    | \\N\{[^}]+\}     # Unicode characters by name
+    | \\[\\'"abfnrtv]  # Single-character escapes
+    )''', re.VERBOSE
+)
+
+ESCAPE_SEQUENCE_RE = re.compile(r'''
+    ( \\U........
+    | \\u....
+    | \\x..
+    | \\[0-7]{1,3}
+    | \\N\{[^}]+\}
+    | \\[\\'"abfnrtv]
+    )''', re.UNICODE | re.VERBOSE
+)
+
 def _unquote_c_string(s):
-    """replace C-style escape sequences (\n, \", etc.) with real chars."""
-    # HACK: Python strings are close enough
-    return s.decode('string_escape', 'replace')
+     """replace C-style escape sequences (\n, \", etc.) with real chars."""
+
+     # Calling s.encode('utf-8').decode('unicode_escape') can return
+     # incorrect output for unicode strings (in both py2 and py3). The safest
+     # way is to match the escape sequences and decode each one on its own.
+     def decode_match(match):
+          return utf8_bytes_string(
+               codecs.decode(match.group(0), 'unicode-escape')
+          )
+
+     if sys.version_info[0] >= 3 and isinstance(s, bytes):
+          return ESCAPE_SEQUENCE_BYTES_RE.sub(decode_match, s)
+     else:
+          return ESCAPE_SEQUENCE_RE.sub(decode_match, s)
+
 
 Authorship = collections.namedtuple('Authorship', 'name email timestamp timezone')
diff --git a/fastimport/processor.py b/fastimport/processor.py
index 13aa987..1eb33cb 100644
--- a/fastimport/processor.py
+++ b/fastimport/processor.py
@@ -29,11 +29,11 @@ To import from a fast-import stream to your version-control system:
 See git-fast-import.1 for the meaning of each command and the
 processors package for examples.
 """
-
 import sys
 import time
 
-import errors
+from fastimport import errors
+from fastimport.helpers import newobject as object
 
 
 class ImportProcessor(object):
@@ -78,7 +78,8 @@ class ImportProcessor(object):
         self.pre_process()
         for cmd in command_iter():
             try:
-                handler = getattr(self.__class__, cmd.name + "_handler")
+                name = (cmd.name + b'_handler').decode('utf8')
+                handler = getattr(self.__class__, name)
             except KeyError:
                 raise errors.MissingHandler(cmd.name)
             else:
@@ -149,7 +150,7 @@ class ImportProcessor(object):
 
 class CommitHandler(object):
     """Base class for commit handling.
-    
+
     Subclasses should override the pre_*, post_* and *_handler
     methods as appropriate.
     """
@@ -161,7 +162,8 @@ class CommitHandler(object):
         self.pre_process_files()
         for fc in self.command.iter_files():
             try:
-                handler = getattr(self.__class__, fc.name[4:] + "_handler")
+                name = (fc.name[4:] + b'_handler').decode('utf8')
+                handler = getattr(self.__class__, name)
             except KeyError:
                 raise errors.MissingHandler(fc.name)
             else:
diff --git a/fastimport/processors/filter_processor.py b/fastimport/processors/filter_processor.py
index b84a009..0ca4472 100644
--- a/fastimport/processors/filter_processor.py
+++ b/fastimport/processors/filter_processor.py
@@ -14,8 +14,6 @@
 # along with this program.  If not, see <http://www.gnu.org/licenses/>.
 
 """Import processor that filters the input (and doesn't import)."""
-
-
 from fastimport import (
     commands,
     helpers,
@@ -42,16 +40,16 @@ class FilterProcessor(processor.ImportProcessor):
     """
 
     known_params = [
-        'include_paths',
-        'exclude_paths',
-        'squash_empty_commits'
-        ]
+        b'include_paths',
+        b'exclude_paths',
+        b'squash_empty_commits'
+    ]
 
     def pre_process(self):
-        self.includes = self.params.get('include_paths')
-        self.excludes = self.params.get('exclude_paths')
+        self.includes = self.params.get(b'include_paths')
+        self.excludes = self.params.get(b'exclude_paths')
         self.squash_empty_commits = bool(
-            self.params.get('squash_empty_commits', True))
+            self.params.get(b'squash_empty_commits', True))
         # What's the new root, if any
         self.new_root = helpers.common_directory(self.includes)
         # Buffer of blobs until we know we need them: mark -> cmd
@@ -128,7 +126,7 @@ class FilterProcessor(processor.ImportProcessor):
         else:
             parents = None
         if cmd.mark is not None:
-            self.parents[":" + cmd.mark] = parents
+            self.parents[b':' + cmd.mark] = parents
 
     def reset_handler(self, cmd):
         """Process a ResetCommand."""
@@ -158,14 +156,14 @@ class FilterProcessor(processor.ImportProcessor):
 
     def _print_command(self, cmd):
         """Wrapper to avoid adding unnecessary blank lines."""
-        text = repr(cmd)
+        text = helpers.repr_bytes(cmd)
         self.outf.write(text)
-        if not text.endswith("\n"):
-            self.outf.write("\n")
+        if not text.endswith(b'\n'):
+            self.outf.write(b'\n')
 
     def _filter_filecommands(self, filecmd_iter):
         """Return the filecommands filtered by includes & excludes.
-        
+
         :return: a list of FileCommand objects
         """
         if self.includes is None and self.excludes is None:
@@ -242,7 +240,7 @@ class FilterProcessor(processor.ImportProcessor):
 
     def _convert_rename(self, fc):
         """Convert a FileRenameCommand into a new FileCommand.
-        
+
         :return: None if the rename is being ignored, otherwise a
           new FileCommand based on the whether the old and new paths
           are inside or outside of the interesting locations.
@@ -273,7 +271,7 @@ class FilterProcessor(processor.ImportProcessor):
 
     def _convert_copy(self, fc):
         """Convert a FileCopyCommand into a new FileCommand.
-        
+
         :return: None if the copy is being ignored, otherwise a
           new FileCommand based on the whether the source and destination
           paths are inside or outside of the interesting locations.
diff --git a/fastimport/processors/query_processor.py b/fastimport/processors/query_processor.py
index c2e750f..a40f2d6 100644
--- a/fastimport/processors/query_processor.py
+++ b/fastimport/processors/query_processor.py
@@ -14,6 +14,7 @@
 # along with this program.  If not, see <http://www.gnu.org/licenses/>.
 
 """Import processor that queries the input (and doesn't import)."""
+from __future__ import print_function
 
 
 from fastimport import (
@@ -28,8 +29,11 @@ class QueryProcessor(processor.ImportProcessor):
     No changes to the current repository are made.
     """
 
-    known_params = commands.COMMAND_NAMES + commands.FILE_COMMAND_NAMES + \
-        ['commit-mark']
+    known_params = (
+        commands.COMMAND_NAMES +
+        commands.FILE_COMMAND_NAMES +
+        [b'commit-mark']
+    )
 
     def __init__(self, params=None, verbose=False):
         processor.ImportProcessor.__init__(self, params, verbose)
@@ -40,7 +44,7 @@ class QueryProcessor(processor.ImportProcessor):
             if 'commit-mark' in params:
                 self.interesting_commit = params['commit-mark']
                 del params['commit-mark']
-            for name, value in params.iteritems():
+            for name, value in params.items():
                 if value == 1:
                     # All fields
                     fields = None
@@ -54,13 +58,13 @@ class QueryProcessor(processor.ImportProcessor):
             return
         if self.interesting_commit and cmd.name == 'commit':
             if cmd.mark == self.interesting_commit:
-                print cmd.to_string()
+                print(cmd.to_string())
                 self._finished = True
             return
-        if self.parsed_params.has_key(cmd.name):
+        if cmd.name in self.parsed_params:
             fields = self.parsed_params[cmd.name]
             str = cmd.dump_str(fields, self.parsed_params, self.verbose)
-            print "%s" % (str,)
+            print("%s" % (str,))
 
     def progress_handler(self, cmd):
         """Process a ProgressCommand."""
diff --git a/fastimport/tests/__init__.py b/fastimport/tests/__init__.py
index 1d3a09e..ae5acb7 100644
--- a/fastimport/tests/__init__.py
+++ b/fastimport/tests/__init__.py
@@ -34,4 +34,5 @@ def test_suite():
     loader = unittest.TestLoader()
     suite = loader.loadTestsFromNames(module_names)
     result.addTests(suite)
+
     return result
diff --git a/fastimport/tests/test_commands.py b/fastimport/tests/test_commands.py
index 0da0773..08fd764 100644
--- a/fastimport/tests/test_commands.py
+++ b/fastimport/tests/test_commands.py
@@ -17,6 +17,11 @@
 
 from unittest import TestCase
 
+from fastimport.helpers import (
+    repr_bytes,
+    utf8_bytes_string,
+    )
+
 from fastimport import (
     commands,
     )
@@ -25,204 +30,208 @@ from fastimport import (
 class TestBlobDisplay(TestCase):
 
     def test_blob(self):
-        c = commands.BlobCommand("1", "hello world")
-        self.assertEqual("blob\nmark :1\ndata 11\nhello world", repr(c))
+        c = commands.BlobCommand(b"1", b"hello world")
+        self.assertEqual(b"blob\nmark :1\ndata 11\nhello world", repr_bytes(c))
 
     def test_blob_no_mark(self):
-        c = commands.BlobCommand(None, "hello world")
-        self.assertEqual("blob\ndata 11\nhello world", repr(c))
+        c = commands.BlobCommand(None, b"hello world")
+        self.assertEqual(b"blob\ndata 11\nhello world", repr_bytes(c))
 
 
 class TestCheckpointDisplay(TestCase):
 
     def test_checkpoint(self):
         c = commands.CheckpointCommand()
-        self.assertEqual("checkpoint", repr(c))
+        self.assertEqual(b'checkpoint', repr_bytes(c))
 
 
 class TestCommitDisplay(TestCase):
 
     def test_commit(self):
         # user tuple is (name, email, secs-since-epoch, secs-offset-from-utc)
-        committer = ('Joe Wong', 'joe@example.com', 1234567890, -6 * 3600)
-        c = commands.CommitCommand("refs/heads/master", "bbb", None, committer,
-            "release v1.0", ":aaa", None, None)
+        committer = (b'Joe Wong', b'joe@example.com', 1234567890, -6 * 3600)
+        c = commands.CommitCommand(b"refs/heads/master", b"bbb", None, committer,
+            b"release v1.0", b":aaa", None, None)
         self.assertEqual(
-            "commit refs/heads/master\n"
-            "mark :bbb\n"
-            "committer Joe Wong <joe@example.com> 1234567890 -0600\n"
-            "data 12\n"
-            "release v1.0\n"
-            "from :aaa",
-            repr(c))
+            b"commit refs/heads/master\n"
+            b"mark :bbb\n"
+            b"committer Joe Wong <joe@example.com> 1234567890 -0600\n"
+            b"data 12\n"
+            b"release v1.0\n"
+            b"from :aaa",
+            repr_bytes(c))
 
     def test_commit_unicode_committer(self):
         # user tuple is (name, email, secs-since-epoch, secs-offset-from-utc)
         name = u'\u013d\xf3r\xe9m \xcdp\u0161\xfam'
-        name_utf8 = name.encode('utf8')
-        committer = (name, 'test@example.com', 1234567890, -6 * 3600)
-        c = commands.CommitCommand("refs/heads/master", "bbb", None, committer,
-            "release v1.0", ":aaa", None, None)
-        self.assertEqual(
-            "commit refs/heads/master\n"
-            "mark :bbb\n"
-            "committer %s <test@example.com> 1234567890 -0600\n"
-            "data 12\n"
-            "release v1.0\n"
-            "from :aaa" % (name_utf8,),
-            repr(c))
+
+        commit_utf8 = utf8_bytes_string(
+            u"commit refs/heads/master\n"
+            u"mark :bbb\n"
+            u"committer %s <test@example.com> 1234567890 -0600\n"
+            u"data 12\n"
+            u"release v1.0\n"
+            u"from :aaa" % (name,)
+        )
+
+        committer = (name, b'test@example.com', 1234567890, -6 * 3600)
+        c = commands.CommitCommand(b'refs/heads/master', b'bbb', None, committer,
+            b'release v1.0', b':aaa', None, None)
+
+        self.assertEqual(commit_utf8, repr_bytes(c))
 
     def test_commit_no_mark(self):
         # user tuple is (name, email, secs-since-epoch, secs-offset-from-utc)
-        committer = ('Joe Wong', 'joe@example.com', 1234567890, -6 * 3600)
-        c = commands.CommitCommand("refs/heads/master", None, None, committer,
-            "release v1.0", ":aaa", None, None)
+        committer = (b'Joe Wong', b'joe@example.com', 1234567890, -6 * 3600)
+        c = commands.CommitCommand(b'refs/heads/master', None, None, committer,
+           b'release v1.0', b':aaa', None, None)
         self.assertEqual(
-            "commit refs/heads/master\n"
-            "committer Joe Wong <joe@example.com> 1234567890 -0600\n"
-            "data 12\n"
-            "release v1.0\n"
-            "from :aaa",
-            repr(c))
+            b"commit refs/heads/master\n"
+            b"committer Joe Wong <joe@example.com> 1234567890 -0600\n"
+            b"data 12\n"
+            b"release v1.0\n"
+            b"from :aaa",
+            repr_bytes(c))
 
     def test_commit_no_from(self):
         # user tuple is (name, email, secs-since-epoch, secs-offset-from-utc)
-        committer = ('Joe Wong', 'joe@example.com', 1234567890, -6 * 3600)
-        c = commands.CommitCommand("refs/heads/master", "bbb", None, committer,
-            "release v1.0", None, None, None)
+        committer = (b'Joe Wong', b'joe@example.com', 1234567890, -6 * 3600)
+        c = commands.CommitCommand(b"refs/heads/master", b"bbb", None, committer,
+            b"release v1.0", None, None, None)
         self.assertEqual(
-            "commit refs/heads/master\n"
-            "mark :bbb\n"
-            "committer Joe Wong <joe@example.com> 1234567890 -0600\n"
-            "data 12\n"
-            "release v1.0",
-            repr(c))
+            b"commit refs/heads/master\n"
+            b"mark :bbb\n"
+            b"committer Joe Wong <joe@example.com> 1234567890 -0600\n"
+            b"data 12\n"
+            b"release v1.0",
+            repr_bytes(c))
 
     def test_commit_with_author(self):
         # user tuple is (name, email, secs-since-epoch, secs-offset-from-utc)
-        author = ('Sue Wong', 'sue@example.com', 1234565432, -6 * 3600)
-        committer = ('Joe Wong', 'joe@example.com', 1234567890, -6 * 3600)
-        c = commands.CommitCommand("refs/heads/master", "bbb", author,
-            committer, "release v1.0", ":aaa", None, None)
+        author = (b'Sue Wong', b'sue@example.com', 1234565432, -6 * 3600)
+        committer = (b'Joe Wong', b'joe@example.com', 1234567890, -6 * 3600)
+        c = commands.CommitCommand(b'refs/heads/master', b'bbb', author,
+            committer, b'release v1.0', b':aaa', None, None)
         self.assertEqual(
-            "commit refs/heads/master\n"
-            "mark :bbb\n"
-            "author Sue Wong <sue@example.com> 1234565432 -0600\n"
-            "committer Joe Wong <joe@example.com> 1234567890 -0600\n"
-            "data 12\n"
-            "release v1.0\n"
-            "from :aaa",
-            repr(c))
+            b"commit refs/heads/master\n"
+            b"mark :bbb\n"
+            b"author Sue Wong <sue@example.com> 1234565432 -0600\n"
+            b"committer Joe Wong <joe@example.com> 1234567890 -0600\n"
+            b"data 12\n"
+            b"release v1.0\n"
+            b"from :aaa",
+            repr_bytes(c))
 
     def test_commit_with_merges(self):
         # user tuple is (name, email, secs-since-epoch, secs-offset-from-utc)
-        committer = ('Joe Wong', 'joe@example.com', 1234567890, -6 * 3600)
-        c = commands.CommitCommand("refs/heads/master", "ddd", None, committer,
-                "release v1.0", ":aaa", [':bbb', ':ccc'], None)
+        committer = (b'Joe Wong', b'joe@example.com', 1234567890, -6 * 3600)
+        c = commands.CommitCommand(b"refs/heads/master", b"ddd", None, committer,
+                b'release v1.0', b":aaa", [b':bbb', b':ccc'], None)
         self.assertEqual(
-            "commit refs/heads/master\n"
-            "mark :ddd\n"
-            "committer Joe Wong <joe@example.com> 1234567890 -0600\n"
-            "data 12\n"
-            "release v1.0\n"
-            "from :aaa\n"
-            "merge :bbb\n"
-            "merge :ccc",
-            repr(c))
+            b"commit refs/heads/master\n"
+            b"mark :ddd\n"
+            b"committer Joe Wong <joe@example.com> 1234567890 -0600\n"
+            b"data 12\n"
+            b"release v1.0\n"
+            b"from :aaa\n"
+            b"merge :bbb\n"
+            b"merge :ccc",
+            repr_bytes(c))
 
     def test_commit_with_filecommands(self):
         file_cmds = iter([
-            commands.FileDeleteCommand('readme.txt'),
-            commands.FileModifyCommand('NEWS', 0100644, None,
-                'blah blah blah'),
+            commands.FileDeleteCommand(b'readme.txt'),
+            commands.FileModifyCommand(b'NEWS', 0o100644, None,
+                b'blah blah blah'),
             ])
         # user tuple is (name, email, secs-since-epoch, secs-offset-from-utc)
-        committer = ('Joe Wong', 'joe@example.com', 1234567890, -6 * 3600)
-        c = commands.CommitCommand("refs/heads/master", "bbb", None, committer,
-            "release v1.0", ":aaa", None, file_cmds)
+        committer = (b'Joe Wong', b'joe@example.com', 1234567890, -6 * 3600)
+        c = commands.CommitCommand(b'refs/heads/master', b'bbb', None, committer,
+            b'release v1.0', b':aaa', None, file_cmds)
         self.assertEqual(
-            "commit refs/heads/master\n"
-            "mark :bbb\n"
-            "committer Joe Wong <joe@example.com> 1234567890 -0600\n"
-            "data 12\n"
-            "release v1.0\n"
-            "from :aaa\n"
-            "D readme.txt\n"
-            "M 644 inline NEWS\n"
-            "data 14\n"
-            "blah blah blah",
-            repr(c))
+            b"commit refs/heads/master\n"
+            b"mark :bbb\n"
+            b"committer Joe Wong <joe@example.com> 1234567890 -0600\n"
+            b"data 12\n"
+            b"release v1.0\n"
+            b"from :aaa\n"
+            b"D readme.txt\n"
+            b"M 644 inline NEWS\n"
+            b"data 14\n"
+            b"blah blah blah",
+            repr_bytes(c))
 
     def test_commit_with_more_authors(self):
         # user tuple is (name, email, secs-since-epoch, secs-offset-from-utc)
-        author = ('Sue Wong', 'sue@example.com', 1234565432, -6 * 3600)
-        committer = ('Joe Wong', 'joe@example.com', 1234567890, -6 * 3600)
+        author = (b'Sue Wong', b'sue@example.com', 1234565432, -6 * 3600)
+        committer = (b'Joe Wong', b'joe@example.com', 1234567890, -6 * 3600)
         more_authors = [
-            ('Al Smith', 'al@example.com', 1234565432, -6 * 3600),
-            ('Bill Jones', 'bill@example.com', 1234565432, -6 * 3600),
-            ]
-        c = commands.CommitCommand("refs/heads/master", "bbb", author,
-            committer, "release v1.0", ":aaa", None, None,
+            (b'Al Smith', b'al@example.com', 1234565432, -6 * 3600),
+            (b'Bill Jones', b'bill@example.com', 1234565432, -6 * 3600),
+        ]
+        c = commands.CommitCommand(b'refs/heads/master', b'bbb', author,
+            committer, b'release v1.0', b':aaa', None, None,
             more_authors=more_authors)
         self.assertEqual(
-            "commit refs/heads/master\n"
-            "mark :bbb\n"
-            "author Sue Wong <sue@example.com> 1234565432 -0600\n"
-            "author Al Smith <al@example.com> 1234565432 -0600\n"
-            "author Bill Jones <bill@example.com> 1234565432 -0600\n"
-            "committer Joe Wong <joe@example.com> 1234567890 -0600\n"
-            "data 12\n"
-            "release v1.0\n"
-            "from :aaa",
-            repr(c))
+            b"commit refs/heads/master\n"
+            b"mark :bbb\n"
+            b"author Sue Wong <sue@example.com> 1234565432 -0600\n"
+            b"author Al Smith <al@example.com> 1234565432 -0600\n"
+            b"author Bill Jones <bill@example.com> 1234565432 -0600\n"
+            b"committer Joe Wong <joe@example.com> 1234567890 -0600\n"
+            b"data 12\n"
+            b"release v1.0\n"
+            b"from :aaa",
+            repr_bytes(c))
 
     def test_commit_with_properties(self):
         # user tuple is (name, email, secs-since-epoch, secs-offset-from-utc)
-        committer = ('Joe Wong', 'joe@example.com', 1234567890, -6 * 3600)
+        committer = (b'Joe Wong', b'joe@example.com', 1234567890, -6 * 3600)
         properties = {
             u'greeting':  u'hello',
             u'planet':    u'world',
             }
-        c = commands.CommitCommand("refs/heads/master", "bbb", None,
-            committer, "release v1.0", ":aaa", None, None,
+        c = commands.CommitCommand(b'refs/heads/master', b'bbb', None,
+            committer, b'release v1.0', b':aaa', None, None,
             properties=properties)
         self.assertEqual(
-            "commit refs/heads/master\n"
-            "mark :bbb\n"
-            "committer Joe Wong <joe@example.com> 1234567890 -0600\n"
-            "data 12\n"
-            "release v1.0\n"
-            "from :aaa\n"
-            "property greeting 5 hello\n"
-            "property planet 5 world",
-            repr(c))
+            b"commit refs/heads/master\n"
+            b"mark :bbb\n"
+            b"committer Joe Wong <joe@example.com> 1234567890 -0600\n"
+            b"data 12\n"
+            b"release v1.0\n"
+            b"from :aaa\n"
+            b"property greeting 5 hello\n"
+            b"property planet 5 world",
+            repr_bytes(c))
 
 class TestCommitCopy(TestCase):
 
     def setUp(self):
         super(TestCommitCopy, self).setUp()
         file_cmds = iter([
-            commands.FileDeleteCommand('readme.txt'),
-            commands.FileModifyCommand('NEWS', 0100644, None, 'blah blah blah'),
+            commands.FileDeleteCommand(b'readme.txt'),
+            commands.FileModifyCommand(b'NEWS', 0o100644, None, b'blah blah blah'),
         ])
 
-        committer = ('Joe Wong', 'joe@example.com', 1234567890, -6 * 3600)
+        committer = (b'Joe Wong', b'joe@example.com', 1234567890, -6 * 3600)
         self.c = commands.CommitCommand(
-            "refs/heads/master", "bbb", None, committer,
-            "release v1.0", ":aaa", None, file_cmds)
+            b'refs/heads/master', b'bbb', None, committer,
+            b'release v1.0', b':aaa', None, file_cmds)
 
     def test_simple_copy(self):
         c2 = self.c.copy()
 
         self.assertFalse(self.c is c2)
-        self.assertEqual(repr(self.c), repr(c2))
+        self.assertEqual(repr_bytes(self.c), repr_bytes(c2))
 
     def test_replace_attr(self):
-        c2 = self.c.copy(mark='ccc')
+        c2 = self.c.copy(mark=b'ccc')
 
         self.assertEqual(
-            repr(self.c).replace('mark :bbb', 'mark :ccc'),
-            repr(c2))
+            repr_bytes(self.c).replace(b'mark :bbb', b'mark :ccc'),
+            repr_bytes(c2)
+        )
 
     def test_invalid_attribute(self):
         self.assertRaises(TypeError, self.c.copy, invalid=True)
@@ -230,166 +239,166 @@ class TestCommitCopy(TestCase):
 class TestFeatureDisplay(TestCase):
 
     def test_feature(self):
-        c = commands.FeatureCommand("dwim")
-        self.assertEqual("feature dwim", repr(c))
+        c = commands.FeatureCommand(b"dwim")
+        self.assertEqual(b"feature dwim", repr_bytes(c))
 
     def test_feature_with_value(self):
-        c = commands.FeatureCommand("dwim", "please")
-        self.assertEqual("feature dwim=please", repr(c))
+        c = commands.FeatureCommand(b"dwim", b"please")
+        self.assertEqual(b"feature dwim=please", repr_bytes(c))
 
 
 class TestProgressDisplay(TestCase):
 
     def test_progress(self):
-        c = commands.ProgressCommand("doing foo")
-        self.assertEqual("progress doing foo", repr(c))
+        c = commands.ProgressCommand(b"doing foo")
+        self.assertEqual(b"progress doing foo", repr_bytes(c))
 
 
 class TestResetDisplay(TestCase):
 
     def test_reset(self):
-        c = commands.ResetCommand("refs/tags/v1.0", ":xxx")
-        self.assertEqual("reset refs/tags/v1.0\nfrom :xxx\n", repr(c))
+        c = commands.ResetCommand(b"refs/tags/v1.0", b":xxx")
+        self.assertEqual(b"reset refs/tags/v1.0\nfrom :xxx\n", repr_bytes(c))
 
     def test_reset_no_from(self):
-        c = commands.ResetCommand("refs/remotes/origin/master", None)
-        self.assertEqual("reset refs/remotes/origin/master", repr(c))
+        c = commands.ResetCommand(b'refs/remotes/origin/master', None)
+        self.assertEqual(b'reset refs/remotes/origin/master', repr_bytes(c))
 
 
 class TestTagDisplay(TestCase):
 
     def test_tag(self):
         # tagger tuple is (name, email, secs-since-epoch, secs-offset-from-utc)
-        tagger = ('Joe Wong', 'joe@example.com', 1234567890, -6 * 3600)
-        c = commands.TagCommand("refs/tags/v1.0", ":xxx", tagger, "create v1.0")
+        tagger = (b'Joe Wong', b'joe@example.com', 1234567890, -6 * 3600)
+        c = commands.TagCommand(b'refs/tags/v1.0', b':xxx', tagger, b'create v1.0')
         self.assertEqual(
-            "tag refs/tags/v1.0\n"
-            "from :xxx\n"
-            "tagger Joe Wong <joe@example.com> 1234567890 -0600\n"
-            "data 11\n"
-            "create v1.0",
-            repr(c))
+            b"tag refs/tags/v1.0\n"
+            b"from :xxx\n"
+            b"tagger Joe Wong <joe@example.com> 1234567890 -0600\n"
+            b"data 11\n"
+            b"create v1.0",
+            repr_bytes(c))
 
     def test_tag_no_from(self):
-        tagger = ('Joe Wong', 'joe@example.com', 1234567890, -6 * 3600)
-        c = commands.TagCommand("refs/tags/v1.0", None, tagger, "create v1.0")
+        tagger = (b'Joe Wong', b'joe@example.com', 1234567890, -6 * 3600)
+        c = commands.TagCommand(b'refs/tags/v1.0', None, tagger, b'create v1.0')
         self.assertEqual(
-            "tag refs/tags/v1.0\n"
-            "tagger Joe Wong <joe@example.com> 1234567890 -0600\n"
-            "data 11\n"
-            "create v1.0",
-            repr(c))
+            b"tag refs/tags/v1.0\n"
+            b"tagger Joe Wong <joe@example.com> 1234567890 -0600\n"
+            b"data 11\n"
+            b"create v1.0",
+            repr_bytes(c))
 
 
 class TestFileModifyDisplay(TestCase):
 
     def test_filemodify_file(self):
-        c = commands.FileModifyCommand("foo/bar", 0100644, ":23", None)
-        self.assertEqual("M 644 :23 foo/bar", repr(c))
+        c = commands.FileModifyCommand(b'foo/bar', 0o100644, b':23', None)
+        self.assertEqual(b'M 644 :23 foo/bar', repr_bytes(c))
 
     def test_filemodify_file_executable(self):
-        c = commands.FileModifyCommand("foo/bar", 0100755, ":23", None)
-        self.assertEqual("M 755 :23 foo/bar", repr(c))
+        c = commands.FileModifyCommand(b'foo/bar', 0o100755, b':23', None)
+        self.assertEqual(b'M 755 :23 foo/bar', repr_bytes(c))
 
     def test_filemodify_file_internal(self):
-        c = commands.FileModifyCommand("foo/bar", 0100644, None,
-            "hello world")
-        self.assertEqual("M 644 inline foo/bar\ndata 11\nhello world", repr(c))
+        c = commands.FileModifyCommand(b'foo/bar', 0o100644, None,
+            b'hello world')
+        self.assertEqual(b'M 644 inline foo/bar\ndata 11\nhello world', repr_bytes(c))
 
     def test_filemodify_symlink(self):
-        c = commands.FileModifyCommand("foo/bar", 0120000, None, "baz")
-        self.assertEqual("M 120000 inline foo/bar\ndata 3\nbaz", repr(c))
+        c = commands.FileModifyCommand(b'foo/bar', 0o120000, None, b'baz')
+        self.assertEqual(b'M 120000 inline foo/bar\ndata 3\nbaz', repr_bytes(c))
 
     def test_filemodify_treeref(self):
-        c = commands.FileModifyCommand("tree-info", 0160000,
-            "revision-id-info", None)
-        self.assertEqual("M 160000 revision-id-info tree-info", repr(c))
+        c = commands.FileModifyCommand(b'tree-info', 0o160000,
+            b'revision-id-info', None)
+        self.assertEqual(b'M 160000 revision-id-info tree-info', repr_bytes(c))
 
 
 class TestFileDeleteDisplay(TestCase):
 
     def test_filedelete(self):
-        c = commands.FileDeleteCommand("foo/bar")
-        self.assertEqual("D foo/bar", repr(c))
+        c = commands.FileDeleteCommand(b'foo/bar')
+        self.assertEqual(b'D foo/bar', repr_bytes(c))
 
 
 class TestFileCopyDisplay(TestCase):
 
     def test_filecopy(self):
-        c = commands.FileCopyCommand("foo/bar", "foo/baz")
-        self.assertEqual("C foo/bar foo/baz", repr(c))
+        c = commands.FileCopyCommand(b'foo/bar', b'foo/baz')
+        self.assertEqual(b'C foo/bar foo/baz', repr_bytes(c))
 
     def test_filecopy_quoted(self):
         # Check the first path is quoted if it contains spaces
-        c = commands.FileCopyCommand("foo/b a r", "foo/b a z")
-        self.assertEqual('C "foo/b a r" foo/b a z', repr(c))
+        c = commands.FileCopyCommand(b'foo/b a r', b'foo/b a z')
+        self.assertEqual(b'C "foo/b a r" foo/b a z', repr_bytes(c))
 
 
 class TestFileRenameDisplay(TestCase):
 
     def test_filerename(self):
-        c = commands.FileRenameCommand("foo/bar", "foo/baz")
-        self.assertEqual("R foo/bar foo/baz", repr(c))
+        c = commands.FileRenameCommand(b'foo/bar', b'foo/baz')
+        self.assertEqual(b'R foo/bar foo/baz', repr_bytes(c))
 
     def test_filerename_quoted(self):
         # Check the first path is quoted if it contains spaces
-        c = commands.FileRenameCommand("foo/b a r", "foo/b a z")
-        self.assertEqual('R "foo/b a r" foo/b a z', repr(c))
+        c = commands.FileRenameCommand(b'foo/b a r', b'foo/b a z')
+        self.assertEqual(b'R "foo/b a r" foo/b a z', repr_bytes(c))
 
 
 class TestFileDeleteAllDisplay(TestCase):
 
     def test_filedeleteall(self):
         c = commands.FileDeleteAllCommand()
-        self.assertEqual("deleteall", repr(c))
+        self.assertEqual(b'deleteall', repr_bytes(c))
 
 class TestNotesDisplay(TestCase):
 
     def test_noteonly(self):
-        c = commands.NoteModifyCommand('foo', "A basic note")
-        self.assertEqual('N inline :foo\ndata 12\nA basic note', repr(c))
+        c = commands.NoteModifyCommand(b'foo', b'A basic note')
+        self.assertEqual(b'N inline :foo\ndata 12\nA basic note', repr_bytes(c))
 
     def test_notecommit(self):
-        committer = ("Ed Mund", 'ed@example.org', 1234565432, 0)
+        committer = (b'Ed Mund', b'ed@example.org', 1234565432, 0)
 
         commits = [
             commands.CommitCommand(
-                ref='refs/heads/master',
-                mark='1',
+                ref=b'refs/heads/master',
+                mark=b'1',
                 author=committer,
                 committer=committer,
-                message="test\n",
+                message=b'test\n',
                 from_=None,
                 merges=[],
                 file_iter=[
-                    commands.FileModifyCommand('bar', 0100644, None, '')
+                    commands.FileModifyCommand(b'bar', 0o100644, None, b'')
                 ]),
             commands.CommitCommand(
-                ref='refs/notes/commits',
+                ref=b'refs/notes/commits',
                 mark=None,
                 author=None,
                 committer=committer,
-                message="Notes added by 'git notes add'\n",
+                message=b"Notes added by 'git notes add'\n",
                 from_=None,
                 merges=[],
                 file_iter=[
-                    commands.NoteModifyCommand('1', "Test note\n")
+                    commands.NoteModifyCommand(b'1', b'Test note\n')
                 ]),
             commands.CommitCommand(
-                ref='refs/notes/test',
+                ref=b'refs/notes/test',
                 mark=None,
                 author=None,
                 committer=committer,
-                message="Notes added by 'git notes add'\n",
+                message=b"Notes added by 'git notes add'\n",
                 from_=None,
                 merges=[],
                 file_iter=[
-                    commands.NoteModifyCommand('1', "Test test\n")
+                    commands.NoteModifyCommand(b'1', b'Test test\n')
                 ])
         ]
 
         self.assertEqual(
-            """commit refs/heads/master
+            b"""commit refs/heads/master
 mark :1
 author %(user)s
 committer %(user)s
@@ -415,30 +424,30 @@ N inline :1
 data 10
 Test test
 """ % {
-    'user': '%s <%s> %d %+05d' % committer,
-}, ''.join(map(repr, commits)))
+    b'user': b'%s <%s> %d %+05d' % committer,
+}, b''.join([repr_bytes(s) for s in commits]))
 
 
 class TestPathChecking(TestCase):
 
     def test_filemodify_path_checking(self):
-        self.assertRaises(ValueError, commands.FileModifyCommand, "",
-            0100644, None, "text")
+        self.assertRaises(ValueError, commands.FileModifyCommand, b'',
+            0o100644, None, b'text')
         self.assertRaises(ValueError, commands.FileModifyCommand, None,
-            0100644, None, "text")
+            0o100644, None, b'text')
 
     def test_filedelete_path_checking(self):
-        self.assertRaises(ValueError, commands.FileDeleteCommand, "")
+        self.assertRaises(ValueError, commands.FileDeleteCommand, b'')
         self.assertRaises(ValueError, commands.FileDeleteCommand, None)
 
     def test_filerename_path_checking(self):
-        self.assertRaises(ValueError, commands.FileRenameCommand, "", "foo")
-        self.assertRaises(ValueError, commands.FileRenameCommand, None, "foo")
-        self.assertRaises(ValueError, commands.FileRenameCommand, "foo", "")
-        self.assertRaises(ValueError, commands.FileRenameCommand, "foo", None)
+        self.assertRaises(ValueError, commands.FileRenameCommand, b'', b'foo')
+        self.assertRaises(ValueError, commands.FileRenameCommand, None, b'foo')
+        self.assertRaises(ValueError, commands.FileRenameCommand, b'foo', b'')
+        self.assertRaises(ValueError, commands.FileRenameCommand, b'foo', None)
 
     def test_filecopy_path_checking(self):
-        self.assertRaises(ValueError, commands.FileCopyCommand, "", "foo")
-        self.assertRaises(ValueError, commands.FileCopyCommand, None, "foo")
-        self.assertRaises(ValueError, commands.FileCopyCommand, "foo", "")
-        self.assertRaises(ValueError, commands.FileCopyCommand, "foo", None)
+        self.assertRaises(ValueError, commands.FileCopyCommand, b'', b'foo')
+        self.assertRaises(ValueError, commands.FileCopyCommand, None, b'foo')
+        self.assertRaises(ValueError, commands.FileCopyCommand, b'foo', b'')
+        self.assertRaises(ValueError, commands.FileCopyCommand, b'foo', None)
diff --git a/fastimport/tests/test_dates.py b/fastimport/tests/test_dates.py
index f893da9..f1ccd67 100644
--- a/fastimport/tests/test_dates.py
+++ b/fastimport/tests/test_dates.py
@@ -24,11 +24,11 @@ from fastimport import (
 class ParseTzTests(TestCase):
 
     def test_parse_tz_utc(self):
-        self.assertEquals(0, dates.parse_tz("+0000"))
-        self.assertEquals(0, dates.parse_tz("-0000"))
+        self.assertEqual(0, dates.parse_tz(b'+0000'))
+        self.assertEqual(0, dates.parse_tz(b'-0000'))
 
     def test_parse_tz_cet(self):
-        self.assertEquals(3600, dates.parse_tz("+0100"))
+        self.assertEqual(3600, dates.parse_tz(b'+0100'))
 
     def test_parse_tz_odd(self):
-        self.assertEquals(1864800, dates.parse_tz("+51800"))
+        self.assertEqual(1864800, dates.parse_tz(b'+51800'))
diff --git a/fastimport/tests/test_errors.py b/fastimport/tests/test_errors.py
index ef87b05..4fc7dcd 100644
--- a/fastimport/tests/test_errors.py
+++ b/fastimport/tests/test_errors.py
@@ -14,7 +14,6 @@
 # along with this program.  If not, see <http://www.gnu.org/licenses/>.
 
 """Test the Import errors"""
-
 from unittest import TestCase
 
 from fastimport import (
diff --git a/fastimport/tests/test_filter_processor.py b/fastimport/tests/test_filter_processor.py
index 74847ad..809bdc8 100644
--- a/fastimport/tests/test_filter_processor.py
+++ b/fastimport/tests/test_filter_processor.py
@@ -14,8 +14,7 @@
 # along with this program.  If not, see <http://www.gnu.org/licenses/>.
 
 """Test FilterProcessor"""
-
-from cStringIO import StringIO
+from io import BytesIO
 
 from unittest import TestCase
 
@@ -30,7 +29,7 @@ from fastimport.processors import (
 
 # A sample input stream containing all (top level) import commands
 _SAMPLE_ALL = \
-"""blob
+b"""blob
 mark :1
 data 4
 foo
@@ -58,7 +57,7 @@ release v0.1
 #  doc/README.txt
 #  doc/index.txt
 _SAMPLE_WITH_DIR = \
-"""blob
+b"""blob
 mark :1
 data 9
 Welcome!
@@ -104,16 +103,16 @@ M 644 :4 doc/index.txt
 
 class TestCaseWithFiltering(TestCase):
 
-    def assertFiltering(self, input, params, expected):
-        outf = StringIO()
+    def assertFiltering(self, input_stream, params, expected):
+        outf = BytesIO()
         proc = filter_processor.FilterProcessor(
             params=params)
         proc.outf = outf
-        s = StringIO(input)
+        s = BytesIO(input_stream)
         p = parser.ImportParser(s)
         proc.process(p.iter_commands)
         out = outf.getvalue()
-        self.assertEquals(expected, out)
+        self.assertEqual(expected, out)
 
 class TestNoFiltering(TestCaseWithFiltering):
 
@@ -121,7 +120,7 @@ class TestNoFiltering(TestCaseWithFiltering):
         self.assertFiltering(_SAMPLE_ALL, None, _SAMPLE_ALL)
 
     def test_params_are_none(self):
-        params = {'include_paths': None, 'exclude_paths': None}
+        params = {b'include_paths': None, b'exclude_paths': None}
         self.assertFiltering(_SAMPLE_ALL, params, _SAMPLE_ALL)
 
 
@@ -131,9 +130,9 @@ class TestIncludePaths(TestCaseWithFiltering):
         # Things to note:
         # * only referenced blobs are retained
         # * from clause is dropped from the first command
-        params = {'include_paths': ['NEWS']}
+        params = {b'include_paths': [b'NEWS']}
         self.assertFiltering(_SAMPLE_WITH_DIR, params, \
-"""blob
+b"""blob
 mark :2
 data 17
 Life
@@ -152,9 +151,9 @@ M 644 :2 NEWS
         #  Additional things to note:
         # * new root: path is now index.txt, not doc/index.txt
         # * other files changed in matching commits are excluded
-        params = {'include_paths': ['doc/index.txt']}
+        params = {b'include_paths': [b'doc/index.txt']}
         self.assertFiltering(_SAMPLE_WITH_DIR, params, \
-"""blob
+b"""blob
 mark :4
 data 11
 == Docs ==
@@ -170,9 +169,9 @@ M 644 :4 index.txt
     def test_file_with_changes(self):
         #  Additional things to note:
         # * from updated to reference parents in the output
-        params = {'include_paths': ['doc/README.txt']}
+        params = {b'include_paths': [b'doc/README.txt']}
         self.assertFiltering(_SAMPLE_WITH_DIR, params, \
-"""blob
+b"""blob
 mark :1
 data 9
 Welcome!
@@ -198,9 +197,9 @@ M 644 :3 README.txt
 """)
 
     def test_subdir(self):
-        params = {'include_paths': ['doc/']}
+        params = {b'include_paths': [b'doc/']}
         self.assertFiltering(_SAMPLE_WITH_DIR, params, \
-"""blob
+b"""blob
 mark :1
 data 9
 Welcome!
@@ -232,9 +231,9 @@ M 644 :4 index.txt
 
     def test_multiple_files_in_subdir(self):
         # The new root should be the subdrectory
-        params = {'include_paths': ['doc/README.txt', 'doc/index.txt']}
+        params = {b'include_paths': [b'doc/README.txt', b'doc/index.txt']}
         self.assertFiltering(_SAMPLE_WITH_DIR, params, \
-"""blob
+b"""blob
 mark :1
 data 9
 Welcome!
@@ -268,9 +267,9 @@ M 644 :4 index.txt
 class TestExcludePaths(TestCaseWithFiltering):
 
     def test_file_in_root(self):
-        params = {'exclude_paths': ['NEWS']}
+        params = {b'exclude_paths': [b'NEWS']}
         self.assertFiltering(_SAMPLE_WITH_DIR, params, \
-"""blob
+b"""blob
 mark :1
 data 9
 Welcome!
@@ -301,9 +300,9 @@ M 644 :4 doc/index.txt
 """)
 
     def test_file_in_subdir(self):
-        params = {'exclude_paths': ['doc/README.txt']}
+        params = {b'exclude_paths': [b'doc/README.txt']}
         self.assertFiltering(_SAMPLE_WITH_DIR, params, \
-"""blob
+b"""blob
 mark :2
 data 17
 Life
@@ -331,9 +330,9 @@ M 644 :4 doc/index.txt
 """)
 
     def test_subdir(self):
-        params = {'exclude_paths': ['doc/']}
+        params = {b'exclude_paths': [b'doc/']}
         self.assertFiltering(_SAMPLE_WITH_DIR, params, \
-"""blob
+b"""blob
 mark :2
 data 17
 Life
@@ -349,9 +348,9 @@ M 644 :2 NEWS
 """)
 
     def test_multple_files(self):
-        params = {'exclude_paths': ['doc/index.txt', 'NEWS']}
+        params = {b'exclude_paths': [b'doc/index.txt', b'NEWS']}
         self.assertFiltering(_SAMPLE_WITH_DIR, params, \
-"""blob
+b"""blob
 mark :1
 data 9
 Welcome!
@@ -380,9 +379,9 @@ M 644 :3 doc/README.txt
 class TestIncludeAndExcludePaths(TestCaseWithFiltering):
 
     def test_included_dir_and_excluded_file(self):
-        params = {'include_paths': ['doc/'], 'exclude_paths': ['doc/index.txt']}
+        params = {b'include_paths': [b'doc/'], b'exclude_paths': [b'doc/index.txt']}
         self.assertFiltering(_SAMPLE_WITH_DIR, params, \
-"""blob
+b"""blob
 mark :1
 data 9
 Welcome!
@@ -416,7 +415,7 @@ M 644 :3 README.txt
 #
 # It then renames doc/README.txt => doc/README
 _SAMPLE_WITH_RENAME_INSIDE = _SAMPLE_WITH_DIR + \
-"""commit refs/heads/master
+b"""commit refs/heads/master
 mark :103
 committer d <b@c> 1234798653 +0000
 data 10
@@ -433,7 +432,7 @@ R doc/README.txt doc/README
 #
 # It then renames doc/README.txt => README
 _SAMPLE_WITH_RENAME_TO_OUTSIDE = _SAMPLE_WITH_DIR + \
-"""commit refs/heads/master
+b"""commit refs/heads/master
 mark :103
 committer d <b@c> 1234798653 +0000
 data 10
@@ -450,7 +449,7 @@ R doc/README.txt README
 #
 # It then renames NEWS => doc/NEWS
 _SAMPLE_WITH_RENAME_TO_INSIDE = _SAMPLE_WITH_DIR + \
-"""commit refs/heads/master
+b"""commit refs/heads/master
 mark :103
 committer d <b@c> 1234798653 +0000
 data 10
@@ -463,9 +462,9 @@ class TestIncludePathsWithRenames(TestCaseWithFiltering):
 
     def test_rename_all_inside(self):
         # These rename commands ought to be kept but adjusted for the new root
-        params = {'include_paths': ['doc/']}
+        params = {b'include_paths': [b'doc/']}
         self.assertFiltering(_SAMPLE_WITH_RENAME_INSIDE, params, \
-"""blob
+b"""blob
 mark :1
 data 9
 Welcome!
@@ -504,9 +503,9 @@ R README.txt README
 
     def test_rename_to_outside(self):
         # These rename commands become deletes
-        params = {'include_paths': ['doc/']}
+        params = {b'include_paths': [b'doc/']}
         self.assertFiltering(_SAMPLE_WITH_RENAME_TO_OUTSIDE, params, \
-"""blob
+b"""blob
 mark :1
 data 9
 Welcome!
@@ -545,9 +544,9 @@ D README.txt
 
     def test_rename_to_inside(self):
         # This ought to create a new file but doesn't yet
-        params = {'include_paths': ['doc/']}
+        params = {b'include_paths': [b'doc/']}
         self.assertFiltering(_SAMPLE_WITH_RENAME_TO_INSIDE, params, \
-"""blob
+b"""blob
 mark :1
 data 9
 Welcome!
@@ -586,7 +585,7 @@ M 644 :4 index.txt
 #
 # It then copies doc/README.txt => doc/README
 _SAMPLE_WITH_COPY_INSIDE = _SAMPLE_WITH_DIR + \
-"""commit refs/heads/master
+b"""commit refs/heads/master
 mark :103
 committer d <b@c> 1234798653 +0000
 data 10
@@ -603,7 +602,7 @@ C doc/README.txt doc/README
 #
 # It then copies doc/README.txt => README
 _SAMPLE_WITH_COPY_TO_OUTSIDE = _SAMPLE_WITH_DIR + \
-"""commit refs/heads/master
+b"""commit refs/heads/master
 mark :103
 committer d <b@c> 1234798653 +0000
 data 10
@@ -620,7 +619,7 @@ C doc/README.txt README
 #
 # It then copies NEWS => doc/NEWS
 _SAMPLE_WITH_COPY_TO_INSIDE = _SAMPLE_WITH_DIR + \
-"""commit refs/heads/master
+b"""commit refs/heads/master
 mark :103
 committer d <b@c> 1234798653 +0000
 data 10
@@ -634,9 +633,9 @@ class TestIncludePathsWithCopies(TestCaseWithFiltering):
 
     def test_copy_all_inside(self):
         # These copy commands ought to be kept but adjusted for the new root
-        params = {'include_paths': ['doc/']}
+        params = {b'include_paths': [b'doc/']}
         self.assertFiltering(_SAMPLE_WITH_COPY_INSIDE, params, \
-"""blob
+b"""blob
 mark :1
 data 9
 Welcome!
@@ -675,9 +674,9 @@ C README.txt README
 
     def test_copy_to_outside(self):
         # This can be ignored
-        params = {'include_paths': ['doc/']}
+        params = {b'include_paths': [b'doc/']}
         self.assertFiltering(_SAMPLE_WITH_COPY_TO_OUTSIDE, params, \
-"""blob
+b"""blob
 mark :1
 data 9
 Welcome!
@@ -709,9 +708,9 @@ M 644 :4 index.txt
 
     def test_copy_to_inside(self):
         # This ought to create a new file but doesn't yet
-        params = {'include_paths': ['doc/']}
+        params = {b'include_paths': [b'doc/']}
         self.assertFiltering(_SAMPLE_WITH_COPY_TO_INSIDE, params, \
-"""blob
+b"""blob
 mark :1
 data 9
 Welcome!
@@ -748,7 +747,7 @@ M 644 :4 index.txt
 #  doc/README.txt
 #  doc/index.txt
 _SAMPLE_WITH_DELETEALL = \
-"""blob
+b"""blob
 mark :1
 data 9
 Welcome!
@@ -784,9 +783,9 @@ M 644 :4 doc/index.txt
 class TestIncludePathsWithDeleteAll(TestCaseWithFiltering):
 
     def test_deleteall(self):
-        params = {'include_paths': ['doc/index.txt']}
+        params = {b'include_paths': [b'doc/index.txt']}
         self.assertFiltering(_SAMPLE_WITH_DELETEALL, params, \
-"""blob
+b"""blob
 mark :4
 data 11
 == Docs ==
@@ -803,7 +802,7 @@ M 644 :4 index.txt
 
 
 _SAMPLE_WITH_TAGS = _SAMPLE_WITH_DIR + \
-"""tag v0.1
+b"""tag v0.1
 from :100
 tagger d <b@c> 1234798653 +0000
 data 12
@@ -821,9 +820,9 @@ class TestIncludePathsWithTags(TestCaseWithFiltering):
         # If a tag references a commit with a parent we kept,
         # keep the tag but adjust 'from' accordingly.
         # Otherwise, delete the tag command.
-        params = {'include_paths': ['NEWS']}
+        params = {b'include_paths': [b'NEWS']}
         self.assertFiltering(_SAMPLE_WITH_TAGS, params, \
-"""blob
+b"""blob
 mark :2
 data 17
 Life
@@ -845,7 +844,7 @@ release v0.2
 
 
 _SAMPLE_WITH_RESETS = _SAMPLE_WITH_DIR + \
-"""reset refs/heads/foo
+b"""reset refs/heads/foo
 reset refs/heads/bar
 from :102
 """
@@ -856,9 +855,9 @@ class TestIncludePathsWithResets(TestCaseWithFiltering):
         # Resets init'ing a branch (without a from) are passed through.
         # If a reset references a commit with a parent we kept,
         # keep the reset but adjust 'from' accordingly.
-        params = {'include_paths': ['NEWS']}
+        params = {b'include_paths': [b'NEWS']}
         self.assertFiltering(_SAMPLE_WITH_RESETS, params, \
-"""blob
+b"""blob
 mark :2
 data 17
 Life
@@ -879,7 +878,7 @@ from :101
 
 # A sample input stream containing empty commit
 _SAMPLE_EMPTY_COMMIT = \
-"""blob
+b"""blob
 mark :1
 data 4
 foo
@@ -898,7 +897,7 @@ empty commit
 
 # A sample input stream containing unresolved from and merge references
 _SAMPLE_FROM_MERGE_COMMIT = \
-"""blob
+b"""blob
 mark :1
 data 4
 foo
@@ -935,11 +934,11 @@ M 644 :99 data/DATA2
 """
 
 class TestSquashEmptyCommitsFlag(TestCaseWithFiltering):
-    
+
     def test_squash_empty_commit(self):
-        params = {'include_paths': None, 'exclude_paths': None}
+        params = {b'include_paths': None, b'exclude_paths': None}
         self.assertFiltering(_SAMPLE_EMPTY_COMMIT, params, \
-"""blob
+b"""blob
 mark :1
 data 4
 foo
@@ -952,13 +951,13 @@ M 644 :1 COPYING
 """)
 
     def test_keep_empty_commit(self):
-        params = {'include_paths': None, 'exclude_paths': None, 'squash_empty_commits': False}
+        params = {b'include_paths': None, b'exclude_paths': None, b'squash_empty_commits': False}
         self.assertFiltering(_SAMPLE_EMPTY_COMMIT, params, _SAMPLE_EMPTY_COMMIT)
 
     def test_squash_unresolved_references(self):
-        params = {'include_paths': None, 'exclude_paths': None}
+        params = {b'include_paths': None, b'exclude_paths': None}
         self.assertFiltering(_SAMPLE_FROM_MERGE_COMMIT, params, \
-"""blob
+b"""blob
 mark :1
 data 4
 foo
@@ -995,15 +994,15 @@ M 644 :99 data/DATA2
 """)
 
     def test_keep_unresolved_from_and_merge(self):
-        params = {'include_paths': None, 'exclude_paths': None, 'squash_empty_commits': False}
+        params = {b'include_paths': None, b'exclude_paths': None, b'squash_empty_commits': False}
         self.assertFiltering(_SAMPLE_FROM_MERGE_COMMIT, params, _SAMPLE_FROM_MERGE_COMMIT)
 
     def test_with_excludes(self):
-        params = {'include_paths': None,
-                  'exclude_paths': ['data/DATA'],
-                  'squash_empty_commits': False}
+        params = {b'include_paths': None,
+                  b'exclude_paths': [b'data/DATA'],
+                  b'squash_empty_commits': False}
         self.assertFiltering(_SAMPLE_FROM_MERGE_COMMIT, params, \
-"""blob
+b"""blob
 mark :1
 data 4
 foo
@@ -1035,11 +1034,11 @@ M 644 :99 data/DATA2
 """)
 
     def test_with_file_includes(self):
-        params = {'include_paths': ['COPYING', 'data/DATA2'],
-                  'exclude_paths': None,
-                  'squash_empty_commits': False}
+        params = {b'include_paths': [b'COPYING', b'data/DATA2'],
+                  b'exclude_paths': None,
+                  b'squash_empty_commits': False}
         self.assertFiltering(_SAMPLE_FROM_MERGE_COMMIT, params, \
-"""blob
+b"""blob
 mark :1
 data 4
 foo
@@ -1070,13 +1069,13 @@ merge :1001
 M 644 :99 data/DATA2
 """
 )
-        
+
     def test_with_directory_includes(self):
-        params = {'include_paths': ['data/'],
-                  'exclude_paths': None,
-                  'squash_empty_commits': False}
+        params = {b'include_paths': [b'data/'],
+                  b'exclude_paths': None,
+                  b'squash_empty_commits': False}
         self.assertFiltering(_SAMPLE_FROM_MERGE_COMMIT, params, \
-"""commit refs/heads/master
+b"""commit refs/heads/master
 mark :3
 committer Joe <joe@example.com> 1234567890 +1000
 data 6
diff --git a/fastimport/tests/test_helpers.py b/fastimport/tests/test_helpers.py
index 4f4b9f0..3198cea 100644
--- a/fastimport/tests/test_helpers.py
+++ b/fastimport/tests/test_helpers.py
@@ -31,25 +31,25 @@ class TestCommonDirectory(unittest.TestCase):
         self.assertEqual(c, None)
 
     def test_one_path(self):
-        c = helpers.common_directory(['foo'])
-        self.assertEqual(c, '')
-        c = helpers.common_directory(['foo/'])
-        self.assertEqual(c, 'foo/')
-        c = helpers.common_directory(['foo/bar'])
-        self.assertEqual(c, 'foo/')
+        c = helpers.common_directory([b'foo'])
+        self.assertEqual(c, b'')
+        c = helpers.common_directory([b'foo/'])
+        self.assertEqual(c, b'foo/')
+        c = helpers.common_directory([b'foo/bar'])
+        self.assertEqual(c, b'foo/')
 
     def test_two_paths(self):
-        c = helpers.common_directory(['foo', 'bar'])
-        self.assertEqual(c, '')
-        c = helpers.common_directory(['foo/', 'bar'])
-        self.assertEqual(c, '')
-        c = helpers.common_directory(['foo/', 'foo/bar'])
-        self.assertEqual(c, 'foo/')
-        c = helpers.common_directory(['foo/bar/x', 'foo/bar/y'])
-        self.assertEqual(c, 'foo/bar/')
-        c = helpers.common_directory(['foo/bar/aa_x', 'foo/bar/aa_y'])
-        self.assertEqual(c, 'foo/bar/')
+        c = helpers.common_directory([b'foo', b'bar'])
+        self.assertEqual(c, b'')
+        c = helpers.common_directory([b'foo/', b'bar'])
+        self.assertEqual(c, b'')
+        c = helpers.common_directory([b'foo/', b'foo/bar'])
+        self.assertEqual(c, b'foo/')
+        c = helpers.common_directory([b'foo/bar/x', b'foo/bar/y'])
+        self.assertEqual(c, b'foo/bar/')
+        c = helpers.common_directory([b'foo/bar/aa_x', b'foo/bar/aa_y'])
+        self.assertEqual(c, b'foo/bar/')
 
     def test_lots_of_paths(self):
-        c = helpers.common_directory(['foo/bar/x', 'foo/bar/y', 'foo/bar/z'])
-        self.assertEqual(c, 'foo/bar/')
+        c = helpers.common_directory([b'foo/bar/x', b'foo/bar/y', b'foo/bar/z'])
+        self.assertEqual(c, b'foo/bar/')
diff --git a/fastimport/tests/test_parser.py b/fastimport/tests/test_parser.py
index ff0b8e1..5084dc9 100644
--- a/fastimport/tests/test_parser.py
+++ b/fastimport/tests/test_parser.py
@@ -14,8 +14,7 @@
 # along with this program.  If not, see <http://www.gnu.org/licenses/>.
 
 """Test the Import parsing"""
-
-import StringIO
+import io
 import time
 import unittest
 
@@ -29,43 +28,43 @@ from fastimport import (
 class TestLineBasedParser(unittest.TestCase):
 
     def test_push_line(self):
-        s = StringIO.StringIO("foo\nbar\nbaz\n")
+        s = io.BytesIO(b"foo\nbar\nbaz\n")
         p = parser.LineBasedParser(s)
-        self.assertEqual('foo', p.next_line())
-        self.assertEqual('bar', p.next_line())
-        p.push_line('bar')
-        self.assertEqual('bar', p.next_line())
-        self.assertEqual('baz', p.next_line())
+        self.assertEqual(b'foo', p.next_line())
+        self.assertEqual(b'bar', p.next_line())
+        p.push_line(b'bar')
+        self.assertEqual(b'bar', p.next_line())
+        self.assertEqual(b'baz', p.next_line())
         self.assertEqual(None, p.next_line())
 
     def test_read_bytes(self):
-        s = StringIO.StringIO("foo\nbar\nbaz\n")
+        s = io.BytesIO(b"foo\nbar\nbaz\n")
         p = parser.LineBasedParser(s)
-        self.assertEqual('fo', p.read_bytes(2))
-        self.assertEqual('o\nb', p.read_bytes(3))
-        self.assertEqual('ar', p.next_line())
+        self.assertEqual(b'fo', p.read_bytes(2))
+        self.assertEqual(b'o\nb', p.read_bytes(3))
+        self.assertEqual(b'ar', p.next_line())
         # Test that the line buffer is ignored
-        p.push_line('bar')
-        self.assertEqual('baz', p.read_bytes(3))
+        p.push_line(b'bar')
+        self.assertEqual(b'baz', p.read_bytes(3))
         # Test missing bytes
         self.assertRaises(errors.MissingBytes, p.read_bytes, 10)
 
     def test_read_until(self):
         # TODO
         return
-        s = StringIO.StringIO("foo\nbar\nbaz\nabc\ndef\nghi\n")
+        s = io.BytesIO(b"foo\nbar\nbaz\nabc\ndef\nghi\n")
         p = parser.LineBasedParser(s)
-        self.assertEqual('foo\nbar', p.read_until('baz'))
-        self.assertEqual('abc', p.next_line())
+        self.assertEqual(b'foo\nbar', p.read_until(b'baz'))
+        self.assertEqual(b'abc', p.next_line())
         # Test that the line buffer is ignored
-        p.push_line('abc')
-        self.assertEqual('def', p.read_until('ghi'))
+        p.push_line(b'abc')
+        self.assertEqual(b'def', p.read_until(b'ghi'))
         # Test missing terminator
-        self.assertRaises(errors.MissingTerminator, p.read_until('>>>'))
+        self.assertRaises(errors.MissingTerminator, p.read_until(b'>>>'))
 
 
 # Sample text
-_sample_import_text = """
+_sample_import_text = b"""
 progress completed
 # Test blob formats
 blob
@@ -153,202 +152,202 @@ class TestImportParser(unittest.TestCase):
         del self.fake_time
 
     def test_iter_commands(self):
-        s = StringIO.StringIO(_sample_import_text)
+        s = io.BytesIO(_sample_import_text)
         p = parser.ImportParser(s)
         result = []
         for cmd in p.iter_commands():
             result.append(cmd)
-            if cmd.name == 'commit':
+            if cmd.name == b'commit':
                 for fc in cmd.iter_files():
                     result.append(fc)
 
         self.assertEqual(len(result), 17)
         cmd1 = result.pop(0)
-        self.assertEqual('progress', cmd1.name)
-        self.assertEqual('completed', cmd1.message)
+        self.assertEqual(b'progress', cmd1.name)
+        self.assertEqual(b'completed', cmd1.message)
         cmd2 = result.pop(0)
-        self.assertEqual('blob', cmd2.name)
-        self.assertEqual('1', cmd2.mark)
-        self.assertEqual(':1', cmd2.id)
-        self.assertEqual('aaaa', cmd2.data)
+        self.assertEqual(b'blob', cmd2.name)
+        self.assertEqual(b'1', cmd2.mark)
+        self.assertEqual(b':1', cmd2.id)
+        self.assertEqual(b'aaaa', cmd2.data)
         self.assertEqual(4, cmd2.lineno)
         cmd3 = result.pop(0)
-        self.assertEqual('blob', cmd3.name)
-        self.assertEqual('@7', cmd3.id)
+        self.assertEqual(b'blob', cmd3.name)
+        self.assertEqual(b'@7', cmd3.id)
         self.assertEqual(None, cmd3.mark)
-        self.assertEqual('bbbbb', cmd3.data)
+        self.assertEqual(b'bbbbb', cmd3.data)
         self.assertEqual(7, cmd3.lineno)
         cmd4 = result.pop(0)
-        self.assertEqual('commit', cmd4.name)
-        self.assertEqual('2', cmd4.mark)
-        self.assertEqual(':2', cmd4.id)
-        self.assertEqual('initial import', cmd4.message)
+        self.assertEqual(b'commit', cmd4.name)
+        self.assertEqual(b'2', cmd4.mark)
+        self.assertEqual(b':2', cmd4.id)
+        self.assertEqual(b'initial import', cmd4.message)
 
-        self.assertEqual(('bugs bunny', 'bugs@bunny.org', self.fake_time, 0), cmd4.committer)
+        self.assertEqual((b'bugs bunny', b'bugs@bunny.org', self.fake_time, 0), cmd4.committer)
         # namedtuple attributes
-        self.assertEqual('bugs bunny', cmd4.committer.name)
-        self.assertEqual('bugs@bunny.org', cmd4.committer.email)
+        self.assertEqual(b'bugs bunny', cmd4.committer.name)
+        self.assertEqual(b'bugs@bunny.org', cmd4.committer.email)
         self.assertEqual(self.fake_time, cmd4.committer.timestamp)
         self.assertEqual(0, cmd4.committer.timezone)
 
         self.assertEqual(None, cmd4.author)
         self.assertEqual(11, cmd4.lineno)
-        self.assertEqual('refs/heads/master', cmd4.ref)
+        self.assertEqual(b'refs/heads/master', cmd4.ref)
         self.assertEqual(None, cmd4.from_)
         self.assertEqual([], cmd4.merges)
         file_cmd1 = result.pop(0)
-        self.assertEqual('filemodify', file_cmd1.name)
-        self.assertEqual('README', file_cmd1.path)
-        self.assertEqual(0100644, file_cmd1.mode)
-        self.assertEqual('Welcome from bugs\n', file_cmd1.data)
+        self.assertEqual(b'filemodify', file_cmd1.name)
+        self.assertEqual(b'README', file_cmd1.path)
+        self.assertEqual(0o100644, file_cmd1.mode)
+        self.assertEqual(b'Welcome from bugs\n', file_cmd1.data)
         cmd5 = result.pop(0)
-        self.assertEqual('commit', cmd5.name)
+        self.assertEqual(b'commit', cmd5.name)
         self.assertEqual(None, cmd5.mark)
-        self.assertEqual('@19', cmd5.id)
-        self.assertEqual('second commit', cmd5.message)
-        self.assertEqual(('', 'bugs@bunny.org', self.fake_time, 0), cmd5.committer)
+        self.assertEqual(b'@19', cmd5.id)
+        self.assertEqual(b'second commit', cmd5.message)
+        self.assertEqual((b'', b'bugs@bunny.org', self.fake_time, 0), cmd5.committer)
         self.assertEqual(None, cmd5.author)
         self.assertEqual(19, cmd5.lineno)
-        self.assertEqual('refs/heads/master', cmd5.ref)
-        self.assertEqual(':2', cmd5.from_)
+        self.assertEqual(b'refs/heads/master', cmd5.ref)
+        self.assertEqual(b':2', cmd5.from_)
         self.assertEqual([], cmd5.merges)
         file_cmd2 = result.pop(0)
-        self.assertEqual('filemodify', file_cmd2.name)
-        self.assertEqual('README', file_cmd2.path)
-        self.assertEqual(0100644, file_cmd2.mode)
-        self.assertEqual('Welcome from bugs, etc.', file_cmd2.data)
+        self.assertEqual(b'filemodify', file_cmd2.name)
+        self.assertEqual(b'README', file_cmd2.path)
+        self.assertEqual(0o100644, file_cmd2.mode)
+        self.assertEqual(b'Welcome from bugs, etc.', file_cmd2.data)
         cmd6 = result.pop(0)
-        self.assertEqual(cmd6.name, 'checkpoint')
+        self.assertEqual(cmd6.name, b'checkpoint')
         cmd7 = result.pop(0)
-        self.assertEqual('progress', cmd7.name)
-        self.assertEqual('completed', cmd7.message)
+        self.assertEqual(b'progress', cmd7.name)
+        self.assertEqual(b'completed', cmd7.message)
         cmd = result.pop(0)
-        self.assertEqual('commit', cmd.name)
-        self.assertEqual('3', cmd.mark)
+        self.assertEqual(b'commit', cmd.name)
+        self.assertEqual(b'3', cmd.mark)
         self.assertEqual(None, cmd.from_)
         cmd = result.pop(0)
-        self.assertEqual('commit', cmd.name)
-        self.assertEqual('4', cmd.mark)
-        self.assertEqual('Commit with heredoc-style message\n', cmd.message)
+        self.assertEqual(b'commit', cmd.name)
+        self.assertEqual(b'4', cmd.mark)
+        self.assertEqual(b'Commit with heredoc-style message\n', cmd.message)
         cmd = result.pop(0)
-        self.assertEqual('commit', cmd.name)
-        self.assertEqual('5', cmd.mark)
-        self.assertEqual('submodule test\n', cmd.message)
+        self.assertEqual(b'commit', cmd.name)
+        self.assertEqual(b'5', cmd.mark)
+        self.assertEqual(b'submodule test\n', cmd.message)
         file_cmd1 = result.pop(0)
-        self.assertEqual('filemodify', file_cmd1.name)
-        self.assertEqual('tree-id', file_cmd1.path)
-        self.assertEqual(0160000, file_cmd1.mode)
-        self.assertEqual("rev-id", file_cmd1.dataref)
+        self.assertEqual(b'filemodify', file_cmd1.name)
+        self.assertEqual(b'tree-id', file_cmd1.path)
+        self.assertEqual(0o160000, file_cmd1.mode)
+        self.assertEqual(b"rev-id", file_cmd1.dataref)
         cmd = result.pop(0)
-        self.assertEqual('feature', cmd.name)
-        self.assertEqual('whatever', cmd.feature_name)
+        self.assertEqual(b'feature', cmd.name)
+        self.assertEqual(b'whatever', cmd.feature_name)
         self.assertEqual(None, cmd.value)
         cmd = result.pop(0)
-        self.assertEqual('feature', cmd.name)
-        self.assertEqual('foo', cmd.feature_name)
-        self.assertEqual('bar', cmd.value)
+        self.assertEqual(b'feature', cmd.name)
+        self.assertEqual(b'foo', cmd.feature_name)
+        self.assertEqual(b'bar', cmd.value)
         cmd = result.pop(0)
-        self.assertEqual('commit', cmd.name)
-        self.assertEqual('6', cmd.mark)
-        self.assertEqual('test of properties', cmd.message)
+        self.assertEqual(b'commit', cmd.name)
+        self.assertEqual(b'6', cmd.mark)
+        self.assertEqual(b'test of properties', cmd.message)
         self.assertEqual({
-            'p1': None,
-            'p2': u'hohum',
-            'p3': u'alpha\nbeta\ngamma',
-            'p4': u'whatever',
-            }, cmd.properties)
+            b'p1': None,
+            b'p2': b'hohum',
+            b'p3': b'alpha\nbeta\ngamma',
+            b'p4': b'whatever',
+        }, cmd.properties)
         cmd = result.pop(0)
-        self.assertEqual('commit', cmd.name)
-        self.assertEqual('7', cmd.mark)
-        self.assertEqual('multi-author test', cmd.message)
-        self.assertEqual('', cmd.committer[0])
-        self.assertEqual('bugs@bunny.org', cmd.committer[1])
-        self.assertEqual('Fluffy', cmd.author[0])
-        self.assertEqual('fluffy@bunny.org', cmd.author[1])
-        self.assertEqual('Daffy', cmd.more_authors[0][0])
-        self.assertEqual('daffy@duck.org', cmd.more_authors[0][1])
-        self.assertEqual('Donald', cmd.more_authors[1][0])
-        self.assertEqual('donald@duck.org', cmd.more_authors[1][1])
+        self.assertEqual(b'commit', cmd.name)
+        self.assertEqual(b'7', cmd.mark)
+        self.assertEqual(b'multi-author test', cmd.message)
+        self.assertEqual(b'', cmd.committer[0])
+        self.assertEqual(b'bugs@bunny.org', cmd.committer[1])
+        self.assertEqual(b'Fluffy', cmd.author[0])
+        self.assertEqual(b'fluffy@bunny.org', cmd.author[1])
+        self.assertEqual(b'Daffy', cmd.more_authors[0][0])
+        self.assertEqual(b'daffy@duck.org', cmd.more_authors[0][1])
+        self.assertEqual(b'Donald', cmd.more_authors[1][0])
+        self.assertEqual(b'donald@duck.org', cmd.more_authors[1][1])
 
     def test_done_feature_missing_done(self):
-        s = StringIO.StringIO("""feature done
+        s = io.BytesIO(b"""feature done
 """)
         p = parser.ImportParser(s)
         cmds = p.iter_commands()
-        self.assertEquals("feature", cmds.next().name)
-        self.assertRaises(errors.PrematureEndOfStream, cmds.next)
+        self.assertEqual(b"feature", next(cmds).name)
+        self.assertRaises(errors.PrematureEndOfStream, lambda: next(cmds))
 
     def test_done_with_feature(self):
-        s = StringIO.StringIO("""feature done
+        s = io.BytesIO(b"""feature done
 done
 more data
 """)
         p = parser.ImportParser(s)
         cmds = p.iter_commands()
-        self.assertEquals("feature", cmds.next().name)
-        self.assertRaises(StopIteration, cmds.next)
+        self.assertEqual(b"feature", next(cmds).name)
+        self.assertRaises(StopIteration, lambda: next(cmds))
 
     def test_done_without_feature(self):
-        s = StringIO.StringIO("""done
+        s = io.BytesIO(b"""done
 more data
 """)
         p = parser.ImportParser(s)
         cmds = p.iter_commands()
-        self.assertEquals([], list(cmds))
+        self.assertEqual([], list(cmds))
 
 
 class TestStringParsing(unittest.TestCase):
 
     def test_unquote(self):
-        s = r'hello \"sweet\" wo\\r\tld'
-        self.assertEquals(r'hello "sweet" wo\r' + "\tld",
+        s = br'hello \"sweet\" wo\\r\tld'
+        self.assertEqual(br'hello "sweet" wo\r' + b'\tld',
             parser._unquote_c_string(s))
 
 
 class TestPathPairParsing(unittest.TestCase):
 
     def test_path_pair_simple(self):
-        p = parser.ImportParser("")
-        self.assertEqual(['foo', 'bar'], p._path_pair("foo bar"))
+        p = parser.ImportParser(b'')
+        self.assertEqual([b'foo', b'bar'], p._path_pair(b'foo bar'))
 
     def test_path_pair_spaces_in_first(self):
         p = parser.ImportParser("")
-        self.assertEqual(['foo bar', 'baz'],
-            p._path_pair('"foo bar" baz'))
+        self.assertEqual([b'foo bar', b'baz'],
+            p._path_pair(b'"foo bar" baz'))
 
 
 class TestTagParsing(unittest.TestCase):
 
     def test_tagger_with_email(self):
-        p = parser.ImportParser(StringIO.StringIO(
-            "tag refs/tags/v1.0\n"
-            "from :xxx\n"
-            "tagger Joe Wong <joe@example.com> 1234567890 -0600\n"
-            "data 11\n"
-            "create v1.0"))
+        p = parser.ImportParser(io.BytesIO(
+            b"tag refs/tags/v1.0\n"
+            b"from :xxx\n"
+            b"tagger Joe Wong <joe@example.com> 1234567890 -0600\n"
+            b"data 11\n"
+            b"create v1.0"))
         cmds = list(p.iter_commands())
-        self.assertEquals(1, len(cmds))
+        self.assertEqual(1, len(cmds))
         self.assertTrue(isinstance(cmds[0], commands.TagCommand))
-        self.assertEquals(cmds[0].tagger,
-            ('Joe Wong', 'joe@example.com', 1234567890.0, -21600))
+        self.assertEqual(cmds[0].tagger,
+            (b'Joe Wong', b'joe@example.com', 1234567890.0, -21600))
 
     def test_tagger_no_email_strict(self):
-        p = parser.ImportParser(StringIO.StringIO(
-            "tag refs/tags/v1.0\n"
-            "from :xxx\n"
-            "tagger Joe Wong\n"
-            "data 11\n"
-            "create v1.0"))
+        p = parser.ImportParser(io.BytesIO(
+            b"tag refs/tags/v1.0\n"
+            b"from :xxx\n"
+            b"tagger Joe Wong\n"
+            b"data 11\n"
+            b"create v1.0"))
         self.assertRaises(errors.BadFormat, list, p.iter_commands())
 
     def test_tagger_no_email_not_strict(self):
-        p = parser.ImportParser(StringIO.StringIO(
-            "tag refs/tags/v1.0\n"
-            "from :xxx\n"
-            "tagger Joe Wong\n"
-            "data 11\n"
-            "create v1.0"), strict=False)
+        p = parser.ImportParser(io.BytesIO(
+            b"tag refs/tags/v1.0\n"
+            b"from :xxx\n"
+            b"tagger Joe Wong\n"
+            b"data 11\n"
+            b"create v1.0"), strict=False)
         cmds = list(p.iter_commands())
-        self.assertEquals(1, len(cmds))
+        self.assertEqual(1, len(cmds))
         self.assertTrue(isinstance(cmds[0], commands.TagCommand))
-        self.assertEquals(cmds[0].tagger[:2], ('Joe Wong', None))
+        self.assertEqual(cmds[0].tagger[:2], (b'Joe Wong', None))
diff --git a/setup.py b/setup.py
index 366aa1d..1e1420c 100755
--- a/setup.py
+++ b/setup.py
@@ -1,7 +1,7 @@
 #!/usr/bin/env python
 from distutils.core import setup
 
-version = "0.9.4"
+version = "0.9.5"
 
 setup(name="fastimport",
       description="VCS fastimport/fastexport parser",
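
Note on the test changes above: the hunks in fastimport/tests/test_parser.py apply the same few Python 3 porting patterns throughout. `StringIO.StringIO` becomes `io.BytesIO`, expected values become bytes literals, `cmds.next()` becomes `next(cmds)`, and octal modes such as `0100644` become `0o100644`. The following is a minimal standalone sketch of those patterns. It does not use the fastimport API; `iter_lines` is a hypothetical helper introduced only for illustration.

```python
import io


def iter_lines(stream):
    """Yield each line of a binary stream without its trailing newline.

    Hypothetical helper for illustration; not part of fastimport.
    """
    for line in stream:
        yield line.rstrip(b"\n")


# A binary stream yields bytes, so expected values must be bytes literals.
stream = io.BytesIO(b"foo\nbar\nbaz\n")
lines = iter_lines(stream)

# The builtin next() replaces the Python 2 iterator method .next().
first = next(lines)
rest = list(lines)

# In Python 3, bytes and str never compare equal, which is why the
# assertions in the diff were changed from 'foo' to b'foo'.
bytes_equal_str = (first == "foo")

# Python 3 requires the 0o prefix for octal literals; 0100644 is a syntax error.
mode = 0o100644
```

Under Python 3 a comparison such as `b'foo' == 'foo'` evaluates to False rather than raising, so unported assertions fail on the comparison itself, not with a type error.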