summaryrefslogtreecommitdiff
path: root/Doc/library/bz2.rst
diff options
context:
space:
mode:
Diffstat (limited to 'Doc/library/bz2.rst')
-rw-r--r--Doc/library/bz2.rst252
1 files changed, 129 insertions, 123 deletions
diff --git a/Doc/library/bz2.rst b/Doc/library/bz2.rst
index 93144b6638..b79bccdb31 100644
--- a/Doc/library/bz2.rst
+++ b/Doc/library/bz2.rst
@@ -1,201 +1,207 @@
-:mod:`bz2` --- Compression compatible with :program:`bzip2`
-===========================================================
+:mod:`bz2` --- Support for :program:`bzip2` compression
+=======================================================
.. module:: bz2
- :synopsis: Interface to compression and decompression routines
- compatible with bzip2.
+ :synopsis: Interfaces for bzip2 compression and decompression.
.. moduleauthor:: Gustavo Niemeyer <niemeyer@conectiva.com>
+.. moduleauthor:: Nadeem Vawda <nadeem.vawda@gmail.com>
.. sectionauthor:: Gustavo Niemeyer <niemeyer@conectiva.com>
+.. sectionauthor:: Nadeem Vawda <nadeem.vawda@gmail.com>
-This module provides a comprehensive interface for the bz2 compression library.
-It implements a complete file interface, one-shot (de)compression functions, and
-types for sequential (de)compression.
+This module provides a comprehensive interface for compressing and
+decompressing data using the bzip2 compression algorithm.
-Here is a summary of the features offered by the bz2 module:
+The :mod:`bz2` module contains:
-* :class:`BZ2File` class implements a complete file interface, including
- :meth:`~BZ2File.readline`, :meth:`~BZ2File.readlines`,
- :meth:`~BZ2File.writelines`, :meth:`~BZ2File.seek`, etc;
+* The :func:`.open` function and :class:`BZ2File` class for reading and
+ writing compressed files.
+* The :class:`BZ2Compressor` and :class:`BZ2Decompressor` classes for
+ incremental (de)compression.
+* The :func:`compress` and :func:`decompress` functions for one-shot
+ (de)compression.
-* :class:`BZ2File` class implements emulated :meth:`~BZ2File.seek` support;
+All of the classes in this module may safely be accessed from multiple threads.
-* :class:`BZ2File` class implements universal newline support;
-* :class:`BZ2File` class offers an optimized line iteration using a readahead
- algorithm;
+(De)compression of files
+------------------------
-* Sequential (de)compression supported by :class:`BZ2Compressor` and
- :class:`BZ2Decompressor` classes;
+.. function:: open(filename, mode='r', compresslevel=9, encoding=None, errors=None, newline=None)
-* One-shot (de)compression supported by :func:`compress` and :func:`decompress`
- functions;
+ Open a bzip2-compressed file in binary or text mode, returning a :term:`file
+ object`.
-* Thread safety uses individual locking mechanism.
+ As with the constructor for :class:`BZ2File`, the *filename* argument can be
+ an actual filename (a :class:`str` or :class:`bytes` object), or an existing
+ file object to read from or write to.
+ The *mode* argument can be any of ``'r'``, ``'rb'``, ``'w'``, ``'wb'``,
+ ``'a'``, or ``'ab'`` for binary mode, or ``'rt'``, ``'wt'``, or ``'at'`` for
+ text mode. The default is ``'rb'``.
-(De)compression of files
-------------------------
+ The *compresslevel* argument is an integer from 1 to 9, as for the
+ :class:`BZ2File` constructor.
-Handling of compressed files is offered by the :class:`BZ2File` class.
+ For binary mode, this function is equivalent to the :class:`BZ2File`
+ constructor: ``BZ2File(filename, mode, compresslevel=compresslevel)``. In
+ this case, the *encoding*, *errors* and *newline* arguments must not be
+ provided.
+ For text mode, a :class:`BZ2File` object is created, and wrapped in an
+ :class:`io.TextIOWrapper` instance with the specified encoding, error
+ handling behavior, and line ending(s).
-.. index::
- single: universal newlines; bz2.BZ2File class
+ .. versionadded:: 3.3
-.. class:: BZ2File(filename, mode='r', buffering=0, compresslevel=9)
- Open a bz2 file. Mode can be either ``'r'`` or ``'w'``, for reading (default)
- or writing. When opened for writing, the file will be created if it doesn't
- exist, and truncated otherwise. If *buffering* is given, ``0`` means
- unbuffered, and larger numbers specify the buffer size; the default is
- ``0``. If *compresslevel* is given, it must be a number between ``1`` and
- ``9``; the default is ``9``. Add a ``'U'`` to mode to open the file for input
- in :term:`universal newlines` mode. Any line ending in the input file will be
- seen as a ``'\n'`` in Python. Also, a file so opened gains the attribute
- :attr:`newlines`; the value for this attribute is one of ``None`` (no newline
- read yet), ``'\r'``, ``'\n'``, ``'\r\n'`` or a tuple containing all the
- newline types seen. Universal newlines are available only when
- reading. Instances support iteration in the same way as normal :class:`file`
- instances.
+.. class:: BZ2File(filename, mode='r', buffering=None, compresslevel=9)
- :class:`BZ2File` supports the :keyword:`with` statement.
-
- .. versionchanged:: 3.1
- Support for the :keyword:`with` statement was added.
+ Open a bzip2-compressed file in binary mode.
+ If *filename* is a :class:`str` or :class:`bytes` object, open the named file
+ directly. Otherwise, *filename* should be a :term:`file object`, which will
+ be used to read or write the compressed data.
- .. note::
+ The *mode* argument can be either ``'r'`` for reading (default), ``'w'`` for
+ overwriting, or ``'a'`` for appending. These can equivalently be given as
+ ``'rb'``, ``'wb'``, and ``'ab'`` respectively.
- This class does not support input files containing multiple streams (such
- as those produced by the :program:`pbzip2` tool). When reading such an
- input file, only the first stream will be accessible. If you require
- support for multi-stream files, consider using the third-party
- :mod:`bz2file` module (available from
- `PyPI <http://pypi.python.org/pypi/bz2file>`_). This module provides a
- backport of Python 3.3's :class:`BZ2File` class, which does support
- multi-stream files.
+ If *filename* is a file object (rather than an actual file name), a mode of
+ ``'w'`` does not truncate the file, and is instead equivalent to ``'a'``.
+ The *buffering* argument is ignored. Its use is deprecated.
- .. method:: close()
+ If *mode* is ``'w'`` or ``'a'``, *compresslevel* can be a number between
+ ``1`` and ``9`` specifying the level of compression: ``1`` produces the
+ least compression, and ``9`` (default) produces the most compression.
- Close the file. Sets data attribute :attr:`closed` to true. A closed file
- cannot be used for further I/O operations. :meth:`close` may be called
- more than once without error.
+ If *mode* is ``'r'``, the input file may be the concatenation of multiple
+ compressed streams.
+ :class:`BZ2File` provides all of the members specified by the
+ :class:`io.BufferedIOBase`, except for :meth:`detach` and :meth:`truncate`.
+ Iteration and the :keyword:`with` statement are supported.
- .. method:: read([size])
+ :class:`BZ2File` also provides the following method:
- Read at most *size* uncompressed bytes, returned as a byte string. If the
- *size* argument is negative or omitted, read until EOF is reached.
+ .. method:: peek([n])
+ Return buffered data without advancing the file position. At least one
+ byte of data will be returned (unless at EOF). The exact number of bytes
+ returned is unspecified.
- .. method:: readline([size])
+ .. note:: While calling :meth:`peek` does not change the file position of
+ the :class:`BZ2File`, it may change the position of the underlying file
+ object (e.g. if the :class:`BZ2File` was constructed by passing a file
+ object for *filename*).
- Return the next line from the file, as a byte string, retaining newline.
- A non-negative *size* argument limits the maximum number of bytes to
- return (an incomplete line may be returned then). Return an empty byte
- string at EOF.
+ .. versionadded:: 3.3
+ .. versionchanged:: 3.1
+ Support for the :keyword:`with` statement was added.
- .. method:: readlines([size])
-
- Return a list of lines read. The optional *size* argument, if given, is an
- approximate bound on the total number of bytes in the lines returned.
+ .. versionchanged:: 3.3
+ The :meth:`fileno`, :meth:`readable`, :meth:`seekable`, :meth:`writable`,
+ :meth:`read1` and :meth:`readinto` methods were added.
+ .. versionchanged:: 3.3
+ Support was added for *filename* being a :term:`file object` instead of an
+ actual filename.
- .. method:: seek(offset[, whence])
+ .. versionchanged:: 3.3
+ The ``'a'`` (append) mode was added, along with support for reading
+ multi-stream files.
- Move to new file position. Argument *offset* is a byte count. Optional
- argument *whence* defaults to ``os.SEEK_SET`` or ``0`` (offset from start
- of file; offset should be ``>= 0``); other values are ``os.SEEK_CUR`` or
- ``1`` (move relative to current position; offset can be positive or
- negative), and ``os.SEEK_END`` or ``2`` (move relative to end of file;
- offset is usually negative, although many platforms allow seeking beyond
- the end of a file).
- Note that seeking of bz2 files is emulated, and depending on the
- parameters the operation may be extremely slow.
+Incremental (de)compression
+---------------------------
+.. class:: BZ2Compressor(compresslevel=9)
- .. method:: tell()
+ Create a new compressor object. This object may be used to compress data
+ incrementally. For one-shot compression, use the :func:`compress` function
+ instead.
- Return the current file position, an integer.
+ *compresslevel*, if given, must be a number between ``1`` and ``9``. The
+ default is ``9``.
+ .. method:: compress(data)
- .. method:: write(data)
+ Provide data to the compressor object. Returns a chunk of compressed data
+ if possible, or an empty byte string otherwise.
- Write the byte string *data* to file. Note that due to buffering,
- :meth:`close` may be needed before the file on disk reflects the data
- written.
+ When you have finished providing data to the compressor, call the
+ :meth:`flush` method to finish the compression process.
- .. method:: writelines(sequence_of_byte_strings)
+ .. method:: flush()
- Write the sequence of byte strings to the file. Note that newlines are not
- added. The sequence can be any iterable object producing byte strings.
- This is equivalent to calling write() for each byte string.
+ Finish the compression process. Returns the compressed data left in
+ internal buffers.
+ The compressor object may not be used after this method has been called.
-Sequential (de)compression
---------------------------
-Sequential compression and decompression is done using the classes
-:class:`BZ2Compressor` and :class:`BZ2Decompressor`.
+.. class:: BZ2Decompressor()
+ Create a new decompressor object. This object may be used to decompress data
+ incrementally. For one-shot compression, use the :func:`decompress` function
+ instead.
-.. class:: BZ2Compressor(compresslevel=9)
+ .. note::
+ This class does not transparently handle inputs containing multiple
+ compressed streams, unlike :func:`decompress` and :class:`BZ2File`. If
+ you need to decompress a multi-stream input with :class:`BZ2Decompressor`,
+ you must use a new decompressor for each stream.
- Create a new compressor object. This object may be used to compress data
- sequentially. If you want to compress data in one shot, use the
- :func:`compress` function instead. The *compresslevel* parameter, if given,
- must be a number between ``1`` and ``9``; the default is ``9``.
+ .. method:: decompress(data)
- .. method:: compress(data)
+ Provide data to the decompressor object. Returns a chunk of decompressed
+ data if possible, or an empty byte string otherwise.
- Provide more data to the compressor object. It will return chunks of
- compressed data whenever possible. When you've finished providing data to
- compress, call the :meth:`flush` method to finish the compression process,
- and return what is left in internal buffers.
+ Attempting to decompress data after the end of the current stream is
+ reached raises an :exc:`EOFError`. If any data is found after the end of
+ the stream, it is ignored and saved in the :attr:`unused_data` attribute.
- .. method:: flush()
+ .. attribute:: eof
- Finish the compression process and return what is left in internal
- buffers. You must not use the compressor object after calling this method.
+ ``True`` if the end-of-stream marker has been reached.
+ .. versionadded:: 3.3
-.. class:: BZ2Decompressor()
- Create a new decompressor object. This object may be used to decompress data
- sequentially. If you want to decompress data in one shot, use the
- :func:`decompress` function instead.
+ .. attribute:: unused_data
- .. method:: decompress(data)
+ Data found after the end of the compressed stream.
- Provide more data to the decompressor object. It will return chunks of
- decompressed data whenever possible. If you try to decompress data after
- the end of stream is found, :exc:`EOFError` will be raised. If any data
- was found after the end of stream, it'll be ignored and saved in
- :attr:`unused_data` attribute.
+ If this attribute is accessed before the end of the stream has been
+ reached, its value will be ``b''``.
One-shot (de)compression
------------------------
-One-shot compression and decompression is provided through the :func:`compress`
-and :func:`decompress` functions.
+.. function:: compress(data, compresslevel=9)
+ Compress *data*.
-.. function:: compress(data, compresslevel=9)
+ *compresslevel*, if given, must be a number between ``1`` and ``9``. The
+ default is ``9``.
- Compress *data* in one shot. If you want to compress data sequentially, use
- an instance of :class:`BZ2Compressor` instead. The *compresslevel* parameter,
- if given, must be a number between ``1`` and ``9``; the default is ``9``.
+ For incremental compression, use a :class:`BZ2Compressor` instead.
.. function:: decompress(data)
- Decompress *data* in one shot. If you want to decompress data sequentially,
- use an instance of :class:`BZ2Decompressor` instead.
+ Decompress *data*.
+
+ If *data* is the concatenation of multiple compressed streams, decompress
+ all of the streams.
+
+ For incremental decompression, use a :class:`BZ2Decompressor` instead.
+
+ .. versionchanged:: 3.3
+ Support for multi-stream inputs was added.