summaryrefslogtreecommitdiff
path: root/doc/gzip.texi
diff options
context:
space:
mode:
Diffstat (limited to 'doc/gzip.texi')
-rw-r--r--doc/gzip.texi549
1 files changed, 549 insertions, 0 deletions
diff --git a/doc/gzip.texi b/doc/gzip.texi
new file mode 100644
index 0000000..1d8d100
--- /dev/null
+++ b/doc/gzip.texi
@@ -0,0 +1,549 @@
+\input texinfo @c -*-texinfo-*-
+@c %**start of header
+@setfilename gzip.info
+@documentencoding UTF-8
+@include version.texi
+@settitle GNU Gzip
+@finalout
+@setchapternewpage odd
+@c %**end of header
+@copying
+This manual is for GNU Gzip
+(version @value{VERSION}, @value{UPDATED}),
+and documents commands for compressing and decompressing data.
+
+Copyright @copyright{} 1998-1999, 2001-2002, 2006-2007, 2009-2016 Free Software
+Foundation, Inc.
+
+Copyright @copyright{} 1992, 1993 Jean-loup Gailly
+
+@quotation
+Permission is granted to copy, distribute and/or modify this document
+under the terms of the GNU Free Documentation License, Version 1.3 or
+any later version published by the Free Software Foundation; with no
+Invariant Sections, with no Front-Cover Texts, and with no Back-Cover
+Texts. A copy of the license is included in the section entitled ``GNU
+Free Documentation License''.
+@end quotation
+@end copying
+
+@dircategory Compression
+@direntry
+* Gzip: (gzip). General (de)compression of files (lzw).
+@end direntry
+
+@dircategory Individual utilities
+@direntry
+* gunzip: (gzip)Overview. Decompression.
+* gzexe: (gzip)Overview. Compress executables.
+* zcat: (gzip)Overview. Decompression to stdout.
+* zdiff: (gzip)Overview. Compare compressed files.
+* zforce: (gzip)Overview. Force .gz extension on files.
+* zgrep: (gzip)Overview. Search compressed files.
+* zmore: (gzip)Overview. Decompression output by pages.
+@end direntry
+
+@titlepage
+@title GNU gzip
+@subtitle The data compression program
+@subtitle for Gzip version @value{VERSION}
+@subtitle @value{UPDATED}
+@author by Jean-loup Gailly
+
+@page
+@vskip 0pt plus 1filll
+@insertcopying
+@end titlepage
+
+@contents
+
+@ifnottex
+@node Top
+@top GNU Gzip: General file (de)compression
+
+@insertcopying
+@end ifnottex
+
+@menu
+* Overview:: Preliminary information.
+* Sample:: Sample output from @command{gzip}.
+* Invoking gzip:: How to run @command{gzip}.
+* Advanced usage:: Concatenated files.
+* Environment:: The @env{GZIP} environment variable
+* Tapes:: Using @command{gzip} on tapes.
+* Problems:: Reporting bugs.
+* GNU Free Documentation License:: Copying and sharing this manual.
+* Concept index:: Index of concepts.
+@end menu
+
+@node Overview
+@chapter Overview
+@cindex overview
+
+@command{gzip} reduces the size of the named files using Lempel--Ziv coding
+(LZ77). Whenever possible, each file is replaced by one with the
+extension @samp{.gz}, while keeping the same ownership modes, access and
+modification times. (The default extension is @samp{-gz} for @abbr{VMS},
+@samp{z} for @abbr{MSDOS}, @abbr{OS/2} @abbr{FAT} and Atari.)
+If no files are specified or
+if a file name is @file{-}, the standard input is compressed to the standard
+output. @command{gzip} will only attempt to compress regular files. In
+particular, it will ignore symbolic links.
+
+If the new file name is too long for its file system, @command{gzip}
+truncates it. @command{gzip} attempts to truncate only the parts of the
+file name longer than 3 characters. (A part is delimited by dots.) If
+the name consists of small parts only, the longest parts are truncated.
+For example, if file names are limited to 14 characters, gzip.msdos.exe
+is compressed to gzi.msd.exe.gz. Names are not truncated on systems
+which do not have a limit on file name length.
+
+By default, @command{gzip} keeps the original file name and time stamp in
+the compressed file. These are used when decompressing the file with the
+@option{-N} option. This is useful when the compressed file name was
+truncated or when the time stamp was not preserved after a file
+transfer. However, due to limitations in the current @command{gzip} file
+format, fractional seconds are discarded. Also, time stamps must fall
+within the range 1970-01-01 00:00:00 through 2106-02-07 06:28:15
+@abbr{UTC}, and hosts whose operating systems use 32-bit time
+stamps are further restricted to time stamps no later than 2038-01-19
+03:14:07 @abbr{UTC}. The upper bounds assume the typical case
+where leap seconds are ignored.
+
+Compressed files can be restored to their original form using @samp{gzip -d}
+or @command{gunzip} or @command{zcat}. If the original name saved in the
+compressed file is not suitable for its file system, a new name is
+constructed from the original one to make it legal.
+
+@command{gunzip} takes a list of files on its command line and replaces
+each file whose name ends with @samp{.gz}, @samp{.z}
+@samp{-gz}, @samp{-z}, or @samp{_z} (ignoring case)
+and which begins with the correct
+magic number with an uncompressed file without the original extension.
+@command{gunzip} also recognizes the special extensions @samp{.tgz} and
+@samp{.taz} as shorthands for @samp{.tar.gz} and @samp{.tar.Z}
+respectively. When compressing, @command{gzip} uses the @samp{.tgz}
+extension if necessary instead of truncating a file with a @samp{.tar}
+extension.
+
+@command{gunzip} can currently decompress files created by @command{gzip},
+@command{zip}, @command{compress} or @command{pack}. The detection of the input
+format is automatic. When using the first two formats, @command{gunzip}
+checks a 32 bit @abbr{CRC} (cyclic redundancy check). For @command{pack},
+@command{gunzip} checks the uncompressed length. The @command{compress} format
+was not designed to allow consistency checks. However @command{gunzip} is
+sometimes able to detect a bad @samp{.Z} file. If you get an error when
+uncompressing a @samp{.Z} file, do not assume that the @samp{.Z} file is
+correct simply because the standard @command{uncompress} does not complain.
+This generally means that the standard @command{uncompress} does not check
+its input, and happily generates garbage output. The @abbr{SCO} @samp{compress
+-H} format (@abbr{LZH} compression method) does not include a @abbr{CRC} but
+also allows some consistency checks.
+
+Files created by @command{zip} can be uncompressed by @command{gzip} only if
+they have a single member compressed with the ``deflation'' method. This
+feature is only intended to help conversion of @file{tar.zip} files to
+the @file{tar.gz} format. To extract a @command{zip} file with a single
+member, use a command like @samp{gunzip <foo.zip} or @samp{gunzip -S
+.zip foo.zip}. To extract @command{zip} files with several
+members, use @command{unzip} instead of @command{gunzip}.
+
+@command{zcat} is identical to @samp{gunzip -c}. @command{zcat}
+uncompresses either a list of files on the command line or its standard
+input and writes the uncompressed data on standard output. @command{zcat}
+will uncompress files that have the correct magic number whether they
+have a @samp{.gz} suffix or not.
+
+@command{gzip} uses the Lempel--Ziv algorithm used in @command{zip} and
+@abbr{PKZIP}@.
+The amount of compression obtained depends on the size of the input and
+the distribution of common substrings. Typically, text such as source
+code or English is reduced by 60--70%. Compression is generally much
+better than that achieved by @abbr{LZW} (as used in @command{compress}), Huffman
+coding (as used in @command{pack}), or adaptive Huffman coding
+(@command{compact}).
+
+Compression is always performed, even if the compressed file is slightly
+larger than the original. The worst case expansion is a few bytes for
+the @command{gzip} file header, plus 5 bytes every 32K block, or an expansion
+ratio of 0.015% for large files. Note that the actual number of used
+disk blocks almost never increases. @command{gzip} normally preserves the mode,
+ownership and time stamps of files when compressing or decompressing.
+
+The @command{gzip} file format is specified in P. Deutsch, GZIP file
+format specification version 4.3,
+@uref{http://www.ietf.org/rfc/rfc1952.txt, Internet @abbr{RFC} 1952} (May
+1996). The @command{zip} deflation format is specified in P. Deutsch,
+DEFLATE Compressed Data Format Specification version 1.3,
+@uref{http://www.ietf.org/rfc/rfc1951.txt, Internet @abbr{RFC} 1951} (May
+1996).
+
+@node Sample
+@chapter Sample output
+@cindex sample
+
+Here are some realistic examples of running @command{gzip}.
+
+This is the output of the command @samp{gzip -h}:
+
+@example
+Usage: gzip [OPTION]... [FILE]...
+Compress or uncompress FILEs (by default, compress FILES in-place).
+
+Mandatory arguments to long options are mandatory for short options too.
+
+ -c, --stdout write on standard output, keep original files unchanged
+ -d, --decompress decompress
+ -f, --force force overwrite of output file and compress links
+ -h, --help give this help
+ -k, --keep keep (don't delete) input files
+ -l, --list list compressed file contents
+ -L, --license display software license
+ -n, --no-name do not save or restore the original name and time stamp
+ -N, --name save or restore the original name and time stamp
+ -q, --quiet suppress all warnings
+ -r, --recursive operate recursively on directories
+ --rsyncable make rsync-friendly archive
+ -S, --suffix=SUF use suffix SUF on compressed files
+ --synchronous synchronous output (safer if system crashes, but slower)
+ -t, --test test compressed file integrity
+ -v, --verbose verbose mode
+ -V, --version display version number
+ -1, --fast compress faster
+ -9, --best compress better
+
+With no FILE, or when FILE is -, read standard input.
+
+Report bugs to <bug-gzip@@gnu.org>.
+@end example
+
+This is the output of the command @samp{gzip -v texinfo.tex}:
+
+@example
+texinfo.tex: 69.3% -- replaced with texinfo.tex.gz
+@end example
+
+The following command will find all regular @samp{.gz} files in the
+current directory and subdirectories (skipping file names that contain
+newlines), and extract them in place without destroying the original,
+stopping on the first failure:
+
+@example
+find . -name '*
+*' -prune -o -name '*.gz' -type f -print |
+ sed "
+ s/'/'\\\\''/g
+ s/^\\(.*\\)\\.gz$/gunzip <'\\1.gz' >'\\1'/
+ " |
+ sh -e
+@end example
+
+@node Invoking gzip
+@chapter Invoking @command{gzip}
+@cindex invoking
+@cindex options
+
+The format for running the @command{gzip} program is:
+
+@example
+gzip @var{option} @dots{}
+@end example
+
+@command{gzip} supports the following options:
+
+@table @option
+@item --stdout
+@itemx --to-stdout
+@itemx -c
+Write output on standard output; keep original files unchanged.
+If there are several input files, the output consists of a sequence of
+independently compressed members. To obtain better compression,
+concatenate all input files before compressing them.
+
+@item --decompress
+@itemx --uncompress
+@itemx -d
+Decompress.
+
+@item --force
+@itemx -f
+Force compression or decompression even if the file has multiple links
+or the corresponding file already exists, or if the compressed data
+is read from or written to a terminal. If the input data is not in
+a format recognized by @command{gzip}, and if the option @option{--stdout} is also
+given, copy the input data without change to the standard output: let
+@command{zcat} behave as @command{cat}. If @option{-f} is not given, and
+when not running in the background, @command{gzip} prompts to verify
+whether an existing file should be overwritten.
+
+@item --help
+@itemx -h
+Print an informative help message describing the options then quit.
+
+@item --keep
+@itemx -k
+Keep (don't delete) input files during compression or decompression.
+
+@item --list
+@itemx -l
+For each compressed file, list the following fields:
+
+@example
+compressed size: size of the compressed file
+uncompressed size: size of the uncompressed file
+ratio: compression ratio (0.0% if unknown)
+uncompressed_name: name of the uncompressed file
+@end example
+
+The uncompressed size is given as @minus{}1 for files not in @command{gzip}
+format, such as compressed @samp{.Z} files. To get the uncompressed size for
+such a file, you can use:
+
+@example
+zcat file.Z | wc -c
+@end example
+
+In combination with the @option{--verbose} option, the following fields are also
+displayed:
+
+@example
+method: compression method (deflate,compress,lzh,pack)
+crc: the 32-bit CRC of the uncompressed data
+date & time: time stamp for the uncompressed file
+@end example
+
+The @abbr{CRC} is given as ffffffff for a file not in gzip format.
+
+With @option{--verbose}, the size totals and compression ratio for all files
+is also displayed, unless some sizes are unknown. With @option{--quiet},
+the title and totals lines are not displayed.
+
+The @command{gzip} format represents the input size modulo
+@math{2^32}, so the uncompressed size and compression ratio are listed
+incorrectly for uncompressed files 4 GiB and larger. To work around
+this problem, you can use the following command to discover a large
+uncompressed file's true size:
+
+@example
+zcat file.gz | wc -c
+@end example
+
+@item --license
+@itemx -L
+Display the @command{gzip} license then quit.
+
+@item --no-name
+@itemx -n
+When compressing, do not save the original file name and time stamp by
+default. (The original name is always saved if the name had to be
+truncated.) When decompressing, do not restore the original file name
+if present (remove only the @command{gzip}
+suffix from the compressed file name) and do not restore the original
+time stamp if present (copy it from the compressed file). This option
+is the default when decompressing.
+
+@item --name
+@itemx -N
+When compressing, always save the original file name and time stamp; this
+is the default. When decompressing, restore the original file name and
+time stamp if present. This option is useful on systems which have
+a limit on file name length or when the time stamp has been lost after
+a file transfer.
+
+@item --quiet
+@itemx -q
+Suppress all warning messages.
+
+@item --recursive
+@itemx -r
+Travel the directory structure recursively. If any of the file names
+specified on the command line are directories, @command{gzip} will descend
+into the directory and compress all the files it finds there (or
+decompress them in the case of @command{gunzip}).
+
+@item --rsyncable
+Cater better to the @command{rsync} program by periodically resetting
+the internal structure of the compressed data stream. This lets the
+@code{rsync} program take advantage of similarities in the uncompressed
+input when synchronizing two files compressed with this flag. The cost:
+the compressed output is usually about one percent larger.
+
+@item --suffix @var{suf}
+@itemx -S @var{suf}
+Use suffix @var{suf} instead of @samp{.gz}. Any suffix can be
+given, but suffixes other than @samp{.z} and @samp{.gz} should be
+avoided to avoid confusion when files are transferred to other systems.
+A null suffix forces gunzip to try decompression on all given files
+regardless of suffix, as in:
+
+@example
+gunzip -S "" * (*.* for MSDOS)
+@end example
+
+Previous versions of gzip used the @samp{.z} suffix. This was changed to
+avoid a conflict with @command{pack}.
+
+@item --synchronous
+Use synchronous output, by transferring output data to the output
+file's storage device when the file system supports this. Because
+file system data can be cached, without this option if the system
+crashes around the time a command like @samp{gzip FOO} is run the user
+might lose both @file{FOO} and @file{FOO.gz}; this is the default with
+@command{gzip}, just as it is the default with most applications that
+move data. When this option is used, @command{gzip} is safer but can
+be considerably slower.
+
+@item --test
+@itemx -t
+Test. Check the compressed file integrity.
+
+@item --verbose
+@itemx -v
+Verbose. Display the name and percentage reduction for each file compressed.
+
+@item --version
+@itemx -V
+Version. Display the version number and compilation options, then quit.
+
+@item --fast
+@itemx --best
+@itemx -@var{n}
+Regulate the speed of compression using the specified digit @var{n},
+where @option{-1} or @option{--fast} indicates the fastest compression
+method (less compression) and @option{--best} or @option{-9} indicates the
+slowest compression method (optimal compression). The default
+compression level is @option{-6} (that is, biased towards high compression at
+expense of speed).
+@end table
+
+@node Advanced usage
+@chapter Advanced usage
+@cindex concatenated files
+
+Multiple compressed files can be concatenated. In this case,
+@command{gunzip} will extract all members at once. If one member is
+damaged, other members might still be recovered after removal of the
+damaged member. Better compression can be usually obtained if all
+members are decompressed and then recompressed in a single step.
+
+This is an example of concatenating @command{gzip} files:
+
+@example
+gzip -c file1 > foo.gz
+gzip -c file2 >> foo.gz
+@end example
+
+@noindent
+Then
+
+@example
+gunzip -c foo
+@end example
+
+@noindent
+is equivalent to
+
+@example
+cat file1 file2
+@end example
+
+In case of damage to one member of a @samp{.gz} file, other members can
+still be recovered (if the damaged member is removed). However,
+you can get better compression by compressing all members at once:
+
+@example
+cat file1 file2 | gzip > foo.gz
+@end example
+
+@noindent
+compresses better than
+
+@example
+gzip -c file1 file2 > foo.gz
+@end example
+
+If you want to recompress concatenated files to get better compression, do:
+
+@example
+zcat old.gz | gzip > new.gz
+@end example
+
+If a compressed file consists of several members, the uncompressed
+size and @abbr{CRC} reported by the @option{--list} option applies to
+the last member
+only. If you need the uncompressed size for all members, you can use:
+
+@example
+zcat file.gz | wc -c
+@end example
+
+If you wish to create a single archive file with multiple members so
+that members can later be extracted independently, use an archiver such
+as @command{tar} or @command{zip}. @acronym{GNU} @command{tar}
+supports the @option{-z}
+option to invoke @command{gzip} transparently. @command{gzip} is designed as a
+complement to @command{tar}, not as a replacement.
+
+@node Environment
+@chapter Environment
+@cindex Environment
+
+The obsolescent environment variable @env{GZIP} can hold a set of
+default options for @command{gzip}. These options are interpreted
+first and can be overwritten by explicit command line parameters. As
+this can cause problems when using scripts, this feature is supported
+only for options that are reasonably likely to not cause too much
+harm, and @command{gzip} warns if it is used. This feature will be
+removed in a future release of @command{gzip}.
+
+You can use an alias or script instead. For example, if
+@command{gzip} is in the directory @samp{/usr/bin} you can prepend
+@file{$HOME/bin} to your @env{PATH} and create an executable script
+@file{$HOME/bin/gzip} containing the following:
+
+@example
+#! /bin/sh
+export PATH=/usr/bin
+exec gzip -9 "$@@"
+@end example
+
+On @abbr{VMS}, the name of the obsolescent environment variable is
+@env{GZIP_OPT}, to avoid a conflict with the symbol set for invocation
+of the program.
+
+@node Tapes
+@chapter Using @command{gzip} on tapes
+@cindex tapes
+
+When writing compressed data to a tape, it is generally necessary to pad
+the output with zeroes up to a block boundary. When the data is read and
+the whole block is passed to @command{gunzip} for decompression,
+@command{gunzip} detects that there is extra trailing garbage after the
+compressed data and emits a warning by default if the garbage contains
+nonzero bytes. You can use the @option{--quiet} option to suppress
+the warning.
+
+@node Problems
+@chapter Reporting Bugs
+@cindex bugs
+
+If you find a bug in @command{gzip}, please send electronic mail to
+@email{bug-gzip@@gnu.org}. Include the version number,
+which you can find by running @w{@samp{gzip -V}}. Also include in your
+message the hardware and operating system, the compiler used to compile
+@command{gzip},
+a description of the bug behavior, and the input to @command{gzip}
+that triggered
+the bug.@refill
+
+@node GNU Free Documentation License
+@appendix GNU Free Documentation License
+
+@include fdl.texi
+
+@node Concept index
+@appendix Concept index
+
+@printindex cp
+
+@bye