author Lorry <lorry@roadtrain.codethink.co.uk> 2012-07-20 19:30:57 +0100
committer Lorry <lorry@roadtrain.codethink.co.uk> 2012-07-20 19:30:57 +0100
commit 04664087ad66f5614f82a2cfba3ae4eda15e792b
tree 332090b15fd2db1b93abf40dccf06211d9aba297 /proginfo
Tarball conversion
Diffstat (limited to 'proginfo')
-rw-r--r--  proginfo/3rdparty.bug    114
-rw-r--r--  proginfo/ZipPorts        285
-rw-r--r--  proginfo/algorith.txt     68
-rw-r--r--  proginfo/ebcdic.msg       63
-rw-r--r--  proginfo/extrafld.txt   1372
-rw-r--r--  proginfo/fileinfo.cms    231
-rw-r--r--  proginfo/infozip.who     242
-rw-r--r--  proginfo/ntsd.txt        111
-rw-r--r--  proginfo/perform.dos     183
-rw-r--r--  proginfo/timezone.txt     85
-rw-r--r--  proginfo/txtvsbin.txt    112
-rw-r--r--  proginfo/ziplimit.txt    243
12 files changed, 3109 insertions, 0 deletions
diff --git a/proginfo/3rdparty.bug b/proginfo/3rdparty.bug
new file mode 100644
index 0000000..32e7823
--- /dev/null
+++ b/proginfo/3rdparty.bug
@@ -0,0 +1,114 @@
+Known, current PKZIP bugs/limitations:
+-------------------------------------
+
+ - PKZIP 2.04g is reported to corrupt some files when compressing them with
+ the -ex option; when tested, the files fail the CRC check, and comparison
+ with the original file shows bogus data (6K in one case) embedded in the
+ middle. PKWARE apparently characterized this as a "known problem."
+
+ - PKUNZIP 2.04g considers volume labels valid only if originated on a FAT
+ file system, but other OSes and file systems (e.g., Amiga and OS/2 HPFS)
+ support volume labels, too.
+
+ - PKUNZIP 2.04g can restore volume labels created by Zip 2.x but not by
+ PKZIP 2.04g (OS/2 DOS box only??).
+
+ - PKUNZIP 2.04g gives an error message for stored directory entries created
+ under other OSes (although it creates the directory anyway), and PKZIP -vt
+ does not report the directory attribute bit as being set, even if it is.
+
+ - PKZIP 2.04g mangles unknown extra fields (especially OS/2 extended attri-
+ butes) when adding new files to an existing zipfile [example: Walnut Creek
+ Hobbes March 1995 CD-ROM, FILE_ID.DIZ additions].
+
+ - PKUNZIP 2.04g is unable to detect or deal with prepended junk in a zipfile,
+ reporting CRC errors in valid compressed data.
+
+ - PKUNZIP 2.04g (registered version) incorrectly updates/freshens the AV extra
+ field in authenticated archives. The resultant extra block length and total
+ extra field length are inconsistent.
+
+ - [Windows version 2.01] Win95 long filenames (VFAT) are stored OK, but the
+ file system is always listed as ordinary DOS FAT.
+
+ - [Windows version 2.50] NT long filenames (NTFS) are stored OK, but the
+ file system is always listed as ordinary DOS FAT.
+
+ - PKZIP 2.04 for DOS encrypts using the OEM code page for 8-bit passwords,
+ while PKZIP 2.50 for Windows uses Latin-1 (ISO 8859-1). This means an
+ archive encrypted with an 8-bit password with one of the two PKZIP versions
+ cannot be decrypted with the other version.
+
+ - PKZIP for Windows GUI (v 2.60), PKZIP for Windows command line (v 2.50) and
+ PKZIP for Unix (v 2.51) save the host's native file timestamps, but
+ only in a local extra field. Thus, timestamp-related selections (update
+ or freshen, both in extraction or archiving operations) use the DOS-format
+ localtime records in the Zip archives for comparisons. This may result
+ in wrong decisions of the program when updating archives that were
+ previously created in a different local time zone.
+
+ - PKZIP releases newer than PKZIP for DOS 2.04g (PKZIP for Windows, both
+ GUI v 2.60 and console v 2.50; PKZIP for Unix v 2.51; probably others too)
+ use different code pages for storing filenames in central (OEM Codepage)
+ and local (ANSI / ISO 8859-1 Codepage) headers. When a stored filename
+ contains extended-ASCII characters, the local and central filename fields
+ do not match. As a consequence, Info-ZIP's Zip program considers such
+ archives corrupt and refuses to modify them. Beginning
+ with release 5.41, Info-ZIP's UnZip contains a workaround to list AND
+ extract such archives with the correct filenames.
+ Maybe PKWARE has implemented this "feature" to allow extraction of their
+ "made-by-PKZIP for Unix/Windows" archives using old (v5.2 and earlier)
+ versions of Info-ZIP's UnZip for Unix/WinNT ??? (UnZip versions before
+ v 5.3 assumed that all archive entries were encoded in the codepage of
+ the UnZip program's host system.)
+
+ - PKUNZIP 2.04g is reported to have problems with archives created on and/or
+ copied from Iomega ZIP drives (irony, eh?).
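
The two code-page items in the list above (8-bit passwords, and filenames that differ between local and central headers) come down to the same text producing different bytes under different encodings. A minimal Python sketch of that effect follows. It assumes cp437 as a stand-in for "the OEM code page" (the real one depends on the system locale), and the helper name encode_both is made up for this illustration.

```python
# Illustration only: cp437 is assumed here as "the OEM code page";
# the actual OEM code page varies with the DOS/Windows locale.
def encode_both(text):
    """Return (OEM bytes, Latin-1 bytes) for the same text."""
    return text.encode("cp437"), text.encode("latin-1")


oem, latin1 = encode_both("M\u00fcller")   # "Mueller" spelled with u-umlaut
print(oem.hex(), latin1.hex())
print(oem == latin1)                        # False: the umlaut byte differs

plain_oem, plain_latin1 = encode_both("README")
print(plain_oem == plain_latin1)            # True: 7-bit ASCII is unaffected
```

Because the byte strings differ, a key derived from an 8-bit password differs too, and a byte-for-byte comparison of local and central filename fields fails, while pure 7-bit ASCII names and passwords are unaffected.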
+
+Known, current WinZip bugs/limitations:
+--------------------------------------
+
+ - [16-bit version 6.1a] NT short filenames (FAT) are stored OK, but the
+ file system is always listed as NTFS.
+
+ - WinZip doesn't allow 8-bit passwords, which means it cannot decrypt an
+ archive created with an 8-bit password (by PKZIP or Info-ZIP's Zip).
+
+ - WinZip (at least Versions 6.3 PL1, 7.0 SR1) fails to remove old extra
+ fields when freshening existing archive entries. When updating archives
+ created by Info-ZIP's Zip that contain UT time stamp extra field blocks,
+ UnZip cannot display or restore the updated (DOS) time stamps of the
+ freshened archive members.
+
+Known, current other third-party Zip utils bugs/limitations:
+------------------------------------------------------------
+
+ - Asi's PKZip clones for Macintosh (versions 2.3 and 2.10d) are thoroughly
+ broken. They create invalid Zip archives!
+ a) For the first entry, both compressed size and uncompressed length
+ are recorded as 0, despite the fact that compressed data of non-zero
+ length has been added.
+ b) Their program creates extra fields with an (undocumented) internal
+ structure that violates the requirements of PKWARE's Zip format
+ specification document "appnote.txt": Their extra field seems to
+ contain pure data; the 4-byte block header consisting of block ID
+ and data length is missing.
+
+Possibly current PKZIP bugs:
+---------------------------
+
+ - PKZIP (2.04g?) can silently ignore read errors on network drives, storing
+ the correct CRC and compressed length but an incorrect and inconsistent
+ uncompressed length.
+
+ - PKZIP (2.04g?), when deleting files from within a zipfile on a Novell
+ drive, sometimes only zeros out the data while failing to shrink the
+ zipfile.
+
+Other limitations:
+-----------------
+
+ - PKZIP 1.x and 2.x encryption has been cracked (known-plaintext approach;
+ see http://www.cryptography.com/ for details).
+
+[many other bugs in PKZIP 1.0, 1.1, 1.93a, 2.04c and 2.04e]
diff --git a/proginfo/ZipPorts b/proginfo/ZipPorts
new file mode 100644
index 0000000..2d946d3
--- /dev/null
+++ b/proginfo/ZipPorts
@@ -0,0 +1,285 @@
+__________________________________________________________________________
+
+ This is the Info-ZIP file ZipPorts, last updated on 17 February 1996.
+__________________________________________________________________________
+
+
+This document defines a set of rules and guidelines for those who wish to
+contribute patches to Zip and UnZip (or even entire ports to new operating
+systems). The list below is something between a style sheet and a "Miss
+Manners" etiquette guide. While Info-ZIP encourages contributions and
+fixes from anyone who finds something worth changing, we are also aware
+of the fact that no two programmers have the same programming style and that
+unrestrained changes by a few dozen contributors would result in hideously
+ugly (and unmaintainable) Frankenstein code. So consider the following an
+attempt by the maintainers to maintain sanity as well as useful code.
+
+(The first version of this document was called either "ZipRules" or the
+"No Feelthy ..." file and was compiled by David Kirschbaum in consulta-
+tion with Mark Adler, Cave McNewt and others. The current incarnation
+expands upon the original with insights gained from a few more years of
+happy hacking...)
+
+
+Summary:
+
+ (0) The Platinum Rule: DON'T BREAK EXISTING PORTS
+(0.1) The Golden Rule: DO UNTO THE CODE AS OTHERS HAVE DONE BEFORE
+(0.2) The Silver Rule: DO UNTO THE LATEST BETA CODE
+(0.3) The Bronze Rule: NO FEELTHY PIGGYBACKS
+
+ (1) NO FEELTHY TABS
+ (2) NO FEELTHY CARRIAGE RETURNS
+ (3) NO FEELTHY 8-BIT CHARS
+ (4) NO FEELTHY LEFT-JUSTIFIED DASHES
+ (5) NO FEELTHY FANCY_FILENAMES
+ (6) NO FEELTHY NON-ZIPFILES AND NO FEELTHY E-MAIL BETAS
+ (7) NO FEELTHY E-MAIL BINARIES
+
+
+Explanations:
+
+ (0) The Platinum Rule: DON'T BREAK EXISTING PORTS
+
+ No doubt about it, this is the one which really pisses us off and
+ pretty much guarantees that your port or patch will be ignored and/
+ or laughed at. Examples range from the *really* severe cases which
+ "port" by ripping out all of the existing multi-OS code, to more
+ subtle oopers like relying on a local capability which doesn't exist
+ on other OSes or in older compilers (e.g., the use of ANSI "#elif"
+ or "#pragma" or "##" constructs, C++ comments, GNU extensions, etc.).
+ As to the former, use #ifdefs for your new code (see rule 0.3). And
+ as to the latter, trust us--there are few things we'd like better
+ than to be able to use some of the elegant "new" features out there
+ (many of which have been around for a decade or more). But our code
+ still compiles on machines dating back even longer, at least in spirit
+ --e.g., the AT&T 3B1 family and Dynix/ptx. Until we say otherwise,
+ dinosaurs are supported.
+
+
+(0.1) The Golden Rule: DO UNTO THE CODE AS OTHERS HAVE DONE BEFORE
+
+ In other words, try to fit into the local style of programming--no
+ matter how painful it may be. This includes cosmetic aspects like
+ indenting the same amount (both in the main C code and in the in-
+ clude files), using braces and comments similarly, NO TABS (see rule
+ #1), etc.; but also more substantive things like (for UnZip) putting
+ character strings into static (far) variables and using the LoadFar-
+ String macros to avoid overflowing limited MS-DOS data segments, and
+ using the ugly Info() macro instead of the more usual *printf()
+ functions so that dynamic-link-library ports are simpler. NEVER put
+ single-OS code (e.g., OS/2) of more than two or three lines into the
+ main (generic) modules; those are shared by everybody, and nobody else
+ cares about it or wants to see it.
+
+ Note that not only do Zip and UnZip differ in these respects, so do
+ individual parts of each program. While it would be nice to have
+ global consistency, cosmetic changes are not a high priority; for
+ now we'll settle for local consistency--i.e., don't make things any
+ worse than they already are.
+
+ Exception (BIG exception): single-letter variable names. Despite
+ the prevailing practice in much of Zip and parts of UnZip, and de-
+ spite the fact that one-letter variables allow you to pack really
+ cool, compact and complicated expressions onto one line, they also
+ make the code very difficult to maintain and are therefore *strongly*
+ discouraged. Don't ask us who is responsible in the first place;
+ while this sort of brain damage is not uncommon among former BASIC
+ programmers, it is nevertheless a lifelong embarrassment, and we do
+ try to pity the poor sod (that is, when we're not chasing bugs and
+ cursing him). :-)
+
+
+(0.2) The Silver Rule: DO UNTO THE LATEST BETA CODE
+
+ Few things are as annoying as receiving a large patch which obviously
+ represents a lot of time and careful work but which is relative to
+ an old version of Info-ZIP code. As wonderful as Larry Wall's patch
+ program is at applying context diffs to modified code, we regularly
+ make near-global changes and/or reorganize big chunks of the sources
+ (particularly in UnZip), and "patch" can't work miracles--big changes
+ invariably break any patch which is relative to an old version of the
+ code.
+
+ Bottom line: contact the Info-ZIP core team FIRST (via the zip-bugs
+ e-mail address) and get up to date with the latest code before begin-
+ ning a big new port. And try to *stay* up to date while working on
+ your port--at least, as much as possible.
+
+
+(0.3) The Bronze Rule: NO FEELTHY PIGGYBACKS
+
+ UnZip is currently ported to something like 12 operating systems
+ (a few more or less depending on how one counts), and each of these,
+ with the possible exception of VM/CMS, has a unique macro identifying
+ it: AMIGA, ATARI_ST, __human68k__, MACOS, MSDOS, MVS, OS2, TOPS20,
+ UNIX, VMS, WIN32. Zip is moving in the same direction. New ports
+ should NOT piggyback one of the existing ports unless they are sub-
+ stantially similar--for example, Minix and Coherent are basically Unix
+ and therefore are included in the UNIX macro, but DOS djgpp ports and
+ OS/2 emx ports (both of which use the Unix-originated GNU C compiler
+ and often have "unix" defined by default) are obviously *not* Unix.
+ [The existing MTS port is a special exception; basically only one per-
+ son knows what MTS really is, and he's not telling. Presumably it's
+ not very close to Unix, but it's not worth arguing about it now.]
+ Along the same lines, neither OS/2 nor Human68K is the same as (or
+ even close to) MS-DOS. MVS and VM/CMS, on the other hand, are quite
+ similar to each other and are therefore combined in most places.
+
+ Bottom line: when adding a new port (e.g., QDOS), create a new macro
+ for it ("QDOS"), a new subdirectory ("qdos") and a new source file for
+ OS-specific code ("qdos/qdos.c"). Use #ifdefs to fit any OS-specific
+ changes into the existing code (e.g., unzpriv.h). If it's close enough
+ to an existing port that piggybacking is a temptation, define a new
+ "combination macro" (e.g., "CMS_MVS") and replace the old macros as
+ required. (This last applies to UnZip, at least; the old preference
+ in Zip was fewer macros and long #ifdef lines, so talk to Onno or Jean-
+ loup about that.) See also rule 0.1.
+
+ (Note that, for UnZip, new ports need not attempt to deal with all
+ features. Among other things, the wildcard-zipfile code in do_wild()
+ may be replaced with a supplied dummy version, since opendir/readdir/
+ closedir() or the equivalent can be difficult to implement.)
+
+
+ (1) NO FEELTHY TABS
+
+ Some editors and e-mail systems either have no capability to use
+ and/or display tab characters (ASCII 9) correctly, or they use non-
+ standard or variable-width tab columns, or other horrors. Some edi-
+ tors auto-convert spaces to tabs, after which the blind use of "diff
+ -c" results in a huge and mostly useless patch. Yes, *we* know about
+ diff's "-b" option, but not everyone does. And yes, we also know this
+ makes the source files bigger, even after compression; so be it. If
+ we *really* cared that much about the size of the sources, we'd still
+ be writing Unix-only utilities.
+
+ Bottom line: use spaces, not tabs.
+
+ Exception: some of the makefiles (the Unix one in particular) require
+ tabs as part of the syntax.
+
+ Related utility programs:
+ Unix, OS/2 and MS-DOS: expand, unexpand.
+ MS-DOS: Buerg's TABS; Toad Hall's TOADSOFT.
+ And some editors have the conversion built-in.
+
+
+ (2) NO FEELTHY CARRIAGE RETURNS
+
+ All source, documentation and other text files shall have Unix style
+ line endings (LF only, a.k.a. ctrl-J), not the DOS/OS2/NT CR+LF or Mac
+ CR-only line endings.
+
+ Reason: "real programmers" in any environment can convert back and
+ forth between Unix and DOS/Mac style. All PC compilers but a few old
+ Borland versions can use either Unix or MS-DOS end-of-lines. Buerg's
+ LIST (file-display utility) for MS-DOS can use Unix or MS-DOS EOLs.
+ Both Zip and UnZip can convert line-endings as appropriate. But Unix
+ utilities like diff and patch die a horrible death (or produce horrible
+ output) if the target files have CRs.
+
+ Related utilities: flip for Unix, OS/2 and MS-DOS; Unix "tr".
+
+ Exceptions: documentation in pre-compiled binary distributions should
+ be in the local (target) format.
+
+
+ (3) NO FEELTHY 8-BIT CHARS
+
+ Do all your editing in a plain-text ASCII editor. No WordPerfect, MS
+ Word, WordStar document mode, or other word processor files, thenkyew.
+ No desktop publishing. *Especially* no EBCDIC. No TIFFs, no GIFs, no
+ embedded pictures or dancing ladies (too bad, Cave Newt). [Sigh... -CN]
+
+ Reason: compatibility with different consoles. My old XT clone is
+ the most limited!
+
+ Exceptions: some Macintosh makefiles apparently require some 8-bit
+ characters; the Human68k port uses 8-bit characters for Kanji or Kana
+ comments (I think); etc.
+
+ Related utilities: vi, emacs, EDLIN, Turbo C editor, other programmers'
+ editors, various word processor -> text conversion utilities.
+
+
+ (4) NO FEELTHY LEFT-JUSTIFIED DASHES
+
+ Always precede repeated dashes (------) with one or more leading non-
+ dash characters: spaces, tabs, pound signs (#), comments (/*), what-
+ ever.
+
+ Reason: sooner or later your source file will be e-mailed through an
+ undigestifier utility, most of which treat leading dashes as end-of-
+ message separators. We'd rather not have your code broken up into a
+ dozen separate untitled messages, thank you.
+
+
+ (5) NO FEELTHY FANCY_FILENAMES
+
+ Assume the worst: that someone on a brain-damaged DOS system has to
+ work with everything your magic fingers produced. Keep the filenames
+ unimaginative and within MS-DOS limits (i.e., ordinary A..Z, 0..9,
+ "-$_!"-type characters, in the 8.3 "filename.ext" format). Mac and
+ Unix users, giggle all you want, but no spaces or multiple dots.
+
+ Reason: compatibility with different file systems. MS-DOS FAT is the
+ most limited, with the exception of CompuServe (6.3, argh).
+
+ Exceptions: slightly longer names are occasionally acceptable within
+ OS-specific subdirectories, but don't do that unless there's a good
+ reason for it.
+
+
+ (6) NO FEELTHY NON-ZIPFILES AND NO FEELTHY E-MAIL BETAS
+
+ Beta testers and developers are in general expected to have both
+ ftp capability and the ability to deal with zipfiles. Those without
+ should either find a friend who does or else learn about ftp-mailers.
+
+ Reason: the core development team barely has time to work on the
+ code, much less prepare oddball formats and/or mail betas out (and
+ the situation is getting worse, sigh).
+
+ Exceptions: anyone seriously proposing to do a new port will be
+ given special treatment, particularly with respect to UnZip; we
+ obviously realize that bootstrapping a completely new port can be
+ quite difficult and have no desire to make it even harder due to
+ lack of access to the latest code (rule 0.2).
+
+ Public releases of UnZip, on the other hand, will be available in
+ two formats: .tar.Z (16-bit compress'd tar) and .zip (either "plain"
+ or self-extracting). Zip sources and executables will generally only
+ be distributed in .zip format, since Zip is pretty much useless without
+ UnZip.
+
+
+ (7) NO FEELTHY E-MAIL BINARIES
+
+ Binary files (e.g., executables, test zipfiles, etc.) should NEVER
+ be mailed raw. Where possible, they should be uploaded via ftp in
+ BINARY mode; if that's impossible, Mark's "ship" ASCII-encoder should
+ be used; and if that's unavailable, uuencode or xxencode should be
+ used. Weirdo NeXTmail, mailtool and MIME formats are also Right Out.
+
+ Files larger than 50KB may need to be broken into pieces for mailing
+ (be sure to label them in order!), unless "ship" is used (it can
+ auto-split, label and mail files if told to do so). If Down Under
+ is involved, files must be broken into under-20KB chunks.
+
+ Reasons: to prevent sounds of gagging mailers from resounding through-
+ out the land. To be relatively efficient in the binary->ASCII conver-
+ sion. (Yeah, yeah, I know, there's better conversions out there. But
+ not as widely known, and they often break on BITNET gateways.)
+
+ Related utilities: ship, uuencode, uudecode, uuxfer20, quux, others.
+ Just make sure they don't leave embedded or trailing spaces (that is,
+ they should use the "`" character in place of ASCII 32). Otherwise
+ mailers are prone to truncate or whatever.
+
+
+Greg Roelofs (a.k.a. Cave Newt)
+Info-ZIP UnZip maintainer
+
+David Kirschbaum
+former Info-ZIP Coordinator
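
Rules 1 through 5 of ZipPorts are mechanical enough to check before sending a patch. The following Python sketch is one possible pre-flight check, not an Info-ZIP tool: the function name check_source and the exact filename pattern are assumptions made for this example (lowercase letters are accepted, and the documented exceptions, such as tabs in makefiles, are not modelled).

```python
import re

# Assumed reading of rule 5: up to 8 name characters, optional extension of
# up to 3, drawn from letters, digits and the "-$_!" characters.
PLAIN_8_3 = re.compile(r"[A-Za-z0-9$_!-]{1,8}(\.[A-Za-z0-9$_!-]{1,3})?")


def check_source(name, data):
    """Return the ZipPorts rule numbers (1-5) that a file appears to break.

    name is the bare filename; data is the raw file content as bytes.
    """
    broken = []
    if b"\t" in data:
        broken.append(1)                      # tabs
    if b"\r" in data:
        broken.append(2)                      # carriage returns
    if any(byte > 0x7F for byte in data):
        broken.append(3)                      # 8-bit characters
    if any(line.startswith(b"--") for line in data.split(b"\n")):
        broken.append(4)                      # left-justified dashes
    if not PLAIN_8_3.fullmatch(name):
        broken.append(5)                      # fancy filename
    return broken


print(check_source("unzip.c", b"int x;\n"))
print(check_source("my file.txt", b"\tfoo\r\n----\n\xe9"))
```

A file that passes returns an empty list; anything else names the rules worth a second look before mailing.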
diff --git a/proginfo/algorith.txt b/proginfo/algorith.txt
new file mode 100644
index 0000000..867e30b
--- /dev/null
+++ b/proginfo/algorith.txt
@@ -0,0 +1,68 @@
+Zip's deflation algorithm is a variation of LZ77 (Lempel-Ziv 1977, see
+reference below). It finds duplicated strings in the input data. The
+second occurrence of a string is replaced by a pointer to the previous
+string, in the form of a pair (distance, length). Distances are
+limited to 32K bytes, and lengths are limited to 258 bytes. When a
+string does not occur anywhere in the previous 32K bytes, it is
+emitted as a sequence of literal bytes. (In this description,
+'string' must be taken as an arbitrary sequence of bytes, and is not
+restricted to printable characters.)
+
+Literals or match lengths are compressed with one Huffman tree, and
+match distances are compressed with another tree. The trees are stored
+in a compact form at the start of each block. The blocks can have any
+size (except that the compressed data for one block must fit in
+available memory). A block is terminated when zip determines that it
+would be useful to start another block with fresh trees. (This is
+somewhat similar to compress.)
+
+Duplicated strings are found using a hash table. All input strings of
+length 3 are inserted in the hash table. A hash index is computed for
+the next 3 bytes. If the hash chain for this index is not empty, all
+strings in the chain are compared with the current input string, and
+the longest match is selected.
+
+The hash chains are searched starting with the most recent strings, to
+favor small distances and thus take advantage of the Huffman encoding.
+The hash chains are singly linked. There are no deletions from the
+hash chains; the algorithm simply discards matches that are too old.
+
+To avoid a worst-case situation, very long hash chains are arbitrarily
+truncated at a certain length, determined by a runtime option (zip -1
+to -9). So zip does not always find the longest possible match but
+generally finds a match which is long enough.
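
The match search described in the last three paragraphs can be sketched in Python as below. This is a simplified illustration, not zip's code: a dict keyed on the 3-byte string stands in for the real hash table (so there are no hash collisions), the names longest_match, insert, greedy_tokens and expand are invented for the example, and max_chain plays the role of the chain-length limit that the -1 to -9 options select.

```python
MIN_MATCH = 3            # shortest string entered in the table
MAX_MATCH = 258          # longest match length
WINDOW = 32 * 1024       # largest match distance


def longest_match(data, pos, head, prev, max_chain):
    """Return (length, distance) of the best earlier match for data[pos:]."""
    best_len, best_dist = 0, 0
    if pos + MIN_MATCH > len(data):
        return best_len, best_dist
    limit = min(MAX_MATCH, len(data) - pos)
    cand = head.get(data[pos:pos + MIN_MATCH], -1)
    while cand >= 0 and max_chain > 0:
        if pos - cand > WINDOW:
            break        # the rest of the chain is older still: discard it
        length = 0
        while length < limit and data[cand + length] == data[pos + length]:
            length += 1
        if length > best_len:   # strict: on ties the nearer match is kept
            best_len, best_dist = length, pos - cand
        cand = prev[cand]       # follow the singly linked chain backwards
        max_chain -= 1
    return best_len, best_dist


def insert(data, pos, head, prev):
    """Put position pos at the front of the chain for its 3-byte string."""
    if pos + MIN_MATCH <= len(data):
        key = data[pos:pos + MIN_MATCH]
        prev[pos] = head.get(key, -1)
        head[key] = pos


def greedy_tokens(data, max_chain=32):
    """Turn data into literal bytes and (distance, length) pairs."""
    head, prev = {}, [-1] * len(data)
    tokens, pos = [], 0
    while pos < len(data):
        length, dist = longest_match(data, pos, head, prev, max_chain)
        step = length if length >= MIN_MATCH else 1
        tokens.append((dist, length) if length >= MIN_MATCH else data[pos])
        for p in range(pos, pos + step):
            insert(data, p, head, prev)
        pos += step
    return tokens


def expand(tokens):
    """Inverse of greedy_tokens, used to check the round trip."""
    out = bytearray()
    for tok in tokens:
        if isinstance(tok, tuple):
            dist, length = tok
            for _ in range(length):
                out.append(out[-dist])
        else:
            out.append(tok)
    return bytes(out)


print(greedy_tokens(b"abcabcabc"))   # [97, 98, 99, (3, 6)]
```

Because new positions go to the front of each chain, the most recent candidates are tried first, and lowering max_chain trades match quality for speed exactly as the truncation paragraph describes.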
+
+zip also defers the selection of matches with a lazy evaluation
+mechanism. After a match of length N has been found, zip searches for a
+longer match at the next input byte. If a longer match is found, the
+previous match is truncated to a length of one (thus producing a single
+literal byte) and the longer match is emitted afterwards. Otherwise,
+the original match is kept, and the next match search is attempted only
+N steps later.
+
+The lazy match evaluation is also subject to a runtime parameter. If
+the current match is long enough, zip reduces the search for a longer
+match, thus speeding up the whole process. If compression ratio is more
+important than speed, zip attempts a complete second search even if
+the first match is already long enough.
+
+The lazy match evaluation is not performed for the fastest compression
+modes (speed options -1 to -3). For these fast modes, new strings
+are inserted in the hash table only when no match was found, or
+when the match is not too long. This degrades the compression ratio
+but saves time since there are both fewer insertions and fewer searches.
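
A small Python sketch of the lazy evaluation step may help. It is an illustration under simplifying assumptions, not zip's implementation: a brute-force search (find_match) replaces the hash chains, the runtime tuning parameters are left out, and both function names are invented here.

```python
MIN_MATCH = 3
MAX_MATCH = 258
WINDOW = 32 * 1024


def find_match(data, pos):
    """Brute-force longest earlier match for data[pos:] as (length, distance)."""
    best_len, best_dist = 0, 0
    limit = min(MAX_MATCH, len(data) - pos)
    for cand in range(max(0, pos - WINDOW), pos):
        length = 0
        while length < limit and data[cand + length] == data[pos + length]:
            length += 1
        if length > 0 and length >= best_len:   # ties go to the nearer match
            best_len, best_dist = length, pos - cand
    return best_len, best_dist


def lazy_tokens(data):
    """Tokenize data, deferring a match when the next byte starts a longer one."""
    tokens, pos = [], 0
    while pos < len(data):
        length, dist = find_match(data, pos)
        if length < MIN_MATCH:
            tokens.append(data[pos])
            pos += 1
            continue
        next_len = find_match(data, pos + 1)[0] if pos + 1 < len(data) else 0
        if next_len > length:
            # A longer match starts one byte later: reduce the current match
            # to a single literal and reconsider from the next position.
            tokens.append(data[pos])
            pos += 1
        else:
            tokens.append((dist, length))
            pos += length       # next search happens only N bytes later
    return tokens


tokens = lazy_tokens(b"abcX bcdefg abcdefg")
print(tokens[-2:])              # [97, (8, 6)]
```

In the sample input, the final "abcdefg" first matches only "abc" (length 3), but one byte later "bcdefg" matches with length 6, so the "a" is emitted as a literal and the longer match follows.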
+
+Jean-loup Gailly
+jloup@chorus.fr
+
+References:
+
+[LZ77] Ziv J., Lempel A., "A Universal Algorithm for Sequential Data
+Compression", IEEE Transactions on Information Theory, Vol. 23, No. 3,
+pp. 337-343.
+
+APPNOTE.TXT documentation file in PKZIP 1.93a. It is available by
+ftp in ftp.cso.uiuc.edu:/pc/exec-pc/pkz193a.exe [128.174.5.59]
+
+'Deflate' Compressed Data Format Specification:
+ftp://ftp.uu.net/pub/archiving/zip/doc/deflate-1.1.doc
diff --git a/proginfo/ebcdic.msg b/proginfo/ebcdic.msg
new file mode 100644
index 0000000..1a7bbad
--- /dev/null
+++ b/proginfo/ebcdic.msg
@@ -0,0 +1,63 @@
+From dima@mitrah.ru Mon Nov 10 02:25:38 2003
+Return-Path: <dima@mitrah.ru>
+Received: from b.mx.sonic.net (eth0.b.mx.sonic.net [209.204.159.4])
+ by eth0.a.lds.sonic.net (8.12.10/8.12.9) with ESMTP id hAAAPccT025257
+ for <roelofs@lds.sonic.net>; Mon, 10 Nov 2003 02:25:38 -0800
+Received: from icicle.pobox.com (icicle.pobox.com [207.8.214.2])
+ by b.mx.sonic.net (8.12.10/8.12.7) with ESMTP id hAAAPar9007141
+ for <roelofs@sonic.net>; Mon, 10 Nov 2003 02:25:37 -0800
+Received: from icicle.pobox.com (localhost[127.0.0.1])
+ by icicle.pobox.com (Postfix) with ESMTP id 9BA347A96B
+ for <roelofs@sonic.net>; Sat, 8 Nov 2003 06:15:13 -0500 (EST)
+Delivered-To: newt@pobox.com
+Received: from mail.ropnet.ru (mail.ropnet.ru[212.42.37.90])
+ by icicle.pobox.com (Postfix) with ESMTP id A96817A8F7
+ for <newt@pobox.com>; Sat, 8 Nov 2003 06:15:04 -0500 (EST)
+Received: from d34-67.ropnet.ru (d34-67.ropnet.ru [212.42.34.67])
+ by mail.ropnet.ru (8.11.7/8.11.7) with ESMTP id hA8BEjF76200
+ for <newt@pobox.com>; Sat, 8 Nov 2003 14:14:46 +0300 (MSK)
+Resent-Date: Sat, 8 Nov 2003 14:14:46 +0300 (MSK)
+Resent-Message-Id: <200311081114.hA8BEjF76200@mail.ropnet.ru>
+Date: Sat, 8 Nov 2003 14:18:18 +0300
+From: Dmitri Koulikov <dima@mitrah.ru>
+X-Mailer: The Bat! (v1.62r) Personal
+Reply-To: Dmitri Koulikov <dima@mitrah.ru>
+X-Priority: 3 (Normal)
+Message-ID: <815640011.20031108141818@mitrah.ru>
+To: newt@pobox.com
+Subject: unzip and zip lack NLS - 2
+Resent-From: Dmitri Koulikov <dima@mitrah.ru>
+MIME-Version: 1.0
+Content-Type: multipart/mixed; boundary="----------EB1581C42AB86662"
+Status: R
+
+------------EB1581C42AB86662
+Content-Type: text/plain; charset=us-ascii
+Content-Transfer-Encoding: 7bit
+
+Hello Greg Roelofs,
+
+ By mistake I sent you the wrong version of ebcdic.h. Now it is as it
+should be.
+ Additionally, I found that zip works well with Russian filenames
+but fails to process the -D switch, so I had to change zipfile.c. Most
+probably this is not good, but it works.
+
+--
+Best regards,
+ Dmitri
+
+mailto:dima@mitrah.ru
+------------EB1581C42AB86662
+Content-Type: application/octet-stream; name="ebcdic.h"
+Content-Transfer-Encoding: base64
+Content-Disposition: attachment; filename="ebcdic.h"
+
+------------EB1581C42AB86662
+Content-Type: application/octet-stream; name="zipfile.c"
+Content-Transfer-Encoding: base64
+Content-Disposition: attachment; filename="zipfile.c"
+
+------------EB1581C42AB86662--
+
+
diff --git a/proginfo/extrafld.txt b/proginfo/extrafld.txt
new file mode 100644
index 0000000..624e05c
--- /dev/null
+++ b/proginfo/extrafld.txt
@@ -0,0 +1,1372 @@
+The following are the known types of zipfile extra fields as of this
+writing. Extra fields are documented in PKWARE's appnote.txt and are
+intended to allow for backward- and forward-compatible extensions to
+the zipfile format. Multiple extra-field types may be chained together,
+provided that the total length of all extra-field data is less than 64KB.
+(In fact, PKWARE requires that the total length of the entire file header,
+including timestamp, file attributes, filename, comment, extra field, etc.,
+be no more than 64KB.)
+
+Each extra-field type (or subblock) must contain a four-byte header con-
+sisting of a two-byte header ID and a two-byte length (little-endian) for
+the remaining data in the subblock. If there are additional subblocks
+within the extra field, the header for each one will appear immediately
+following the data for the previous subblock (i.e., with no padding for
+alignment).
+
+All integer fields in the descriptions below are in little-endian (Intel)
+format unless otherwise specified. Note that "Short" means two bytes,
+"Long" means four bytes, and "Long-Long" means eight bytes, regardless
+of their native sizes. Unless specifically noted, all integer fields should
+be interpreted as unsigned (non-negative) numbers.
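
The subblock chaining described above (two-byte ID, two-byte length, data, no padding) can be walked with a few lines of Python. This is a sketch under the stated layout only; the function name iter_subblocks and the error handling are choices made for this example.

```python
import struct


def iter_subblocks(extra):
    """Yield (header_id, data) for each subblock of a raw extra field."""
    offset = 0
    while offset + 4 <= len(extra):
        # Two little-endian unsigned shorts: header ID, then data length.
        header_id, size = struct.unpack_from("<HH", extra, offset)
        offset += 4
        if offset + size > len(extra):
            raise ValueError("subblock 0x%04x overruns the extra field" % header_id)
        yield header_id, extra[offset:offset + size]
        offset += size          # next header follows immediately, unpadded
    if offset != len(extra):
        raise ValueError("trailing bytes after the last subblock")


sample = (struct.pack("<HH", 0x5455, 5) + b"\x01\x00\x00\x00\x00"
          + struct.pack("<HH", 0x7875, 0))
for header_id, data in iter_subblocks(sample):
    print("0x%04x" % header_id, len(data))
```

In Python, the raw bytes to feed such a walker are available as the extra attribute of zipfile.ZipInfo objects.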
+
+Christian Spieler, 20010517
+
+Updated to include the Unicode extra fields. Added new Unix extra field.
+
+Ed Gordon, 20060819, 20070607, 20070909, 20080426, 20080509
+
+ -------------------------
+
+ Header ID's of 0 thru 31 are reserved for use by PKWARE.
+ The remaining ID's can be used by third party vendors for
+ proprietary usage.
+
+ The current Header ID mappings defined by PKWARE are:
+
+ 0x0001 ZIP64 extended information extra field
+ 0x0007 AV Info
+ 0x0009 OS/2 extended attributes (also Info-ZIP)
+ 0x000a NTFS (Win9x/WinNT FileTimes)
+ 0x000c OpenVMS (also Info-ZIP)
+ 0x000d Unix
+ 0x000f Patch Descriptor
+ 0x0014 PKCS#7 Store for X.509 Certificates
+ 0x0015 X.509 Certificate ID and Signature for
+ individual file
+ 0x0016 X.509 Certificate ID for Central Directory
+
+ The Header ID mappings defined by Info-ZIP and third parties are:
+
+ 0x0065 IBM S/390 attributes - uncompressed
+ 0x0066 IBM S/390 attributes - compressed
+ 0x07c8 Info-ZIP Macintosh (old, J. Lee)
+ 0x2605 ZipIt Macintosh (first version)
+ 0x2705 ZipIt Macintosh v 1.3.5 and newer (w/o full filename)
+ 0x334d Info-ZIP Macintosh (new, D. Haase's 'Mac3' field )
+ 0x4154 Tandem NSK
+ 0x4341 Acorn/SparkFS (David Pilling)
+ 0x4453 Windows NT security descriptor (binary ACL)
+ 0x4704 VM/CMS
+ 0x470f MVS
+ 0x4854 Theos, old unofficial port
+ 0x4b46 FWKCS MD5 (see below)
+ 0x4c41 OS/2 access control list (text ACL)
+ 0x4d49 Info-ZIP OpenVMS (obsolete)
+ 0x4d63 Macintosh SmartZIP, by Marco Bambini
+ 0x4f4c Xceed original location extra field
+ 0x5356 AOS/VS (binary ACL)
+ 0x5455 extended timestamp
+ 0x554e Xceed unicode extra field
+ 0x5855 Info-ZIP Unix (original; also OS/2, NT, etc.)
+ 0x6375 Info-ZIP Unicode Comment
+ 0x6542 BeOS (BeBox, PowerMac, etc.)
+ 0x6854 Theos
+ 0x7075 Info-ZIP Unicode Path
+ 0x756e ASi Unix
+ 0x7855 Info-ZIP Unix (previous new)
+ 0x7875 Info-ZIP Unix (new)
+ 0xfb4a SMS/QDOS
+
+The following are detailed descriptions of the known extra-field block types:
+
+ -OS/2 Extended Attributes Extra Field:
+ ====================================
+
+ The following is the layout of the OS/2 extended attributes "extra"
+ block. (Last Revision 19960922)
+
+ Note: all fields stored in Intel low-byte/high-byte order.
+
+ Local-header version:
+
+          Value         Size        Description
+          -----         ----        -----------
+  (OS/2)  0x0009        Short       tag for this extra block type
+          TSize         Short       total data size for this block
+          BSize         Long        uncompressed EA data size
+          CType         Short       compression type
+          EACRC         Long        CRC value for uncompressed EA data
+          (var.)        variable    compressed EA data
+
+ Central-header version:
+
+          Value         Size        Description
+          -----         ----        -----------
+  (OS/2)  0x0009        Short       tag for this extra block type
+          TSize         Short       total data size for this block (4)
+          BSize         Long        size of uncompressed local EA data
+
+ The value of CType is interpreted according to the "compression
+ method" section above; i.e., 0 for stored, 8 for deflated, etc.
+
+ The OS/2 extended attribute structure (FEA2LIST) is compressed and
+ then stored in its entirety within this structure. There will only
+ ever be one block of data in the variable-length field.
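
As a worked example of the local-header layout above, the following Python sketch decodes the data portion of an OS/2 EA block, i.e. the bytes that follow the tag and TSize. It rests on two assumptions that the text does not spell out: that CType 8 data is a raw deflate stream (no zlib wrapper) and that EACRC is the usual Zip CRC-32. The function name parse_ea_block is invented here.

```python
import struct
import zlib


def parse_ea_block(data):
    """Return the uncompressed EA data from a local-header 0x0009 block body."""
    # BSize (Long), CType (Short), EACRC (Long): 10 bytes, little-endian.
    bsize, ctype, crc = struct.unpack_from("<IHI", data, 0)
    payload = data[10:]
    if ctype == 0:                       # stored
        raw = payload
    elif ctype == 8:                     # deflated
        # Assumption: raw deflate stream, hence the negative window size.
        raw = zlib.decompress(payload, wbits=-15)
    else:
        raise ValueError("unsupported compression type %d" % ctype)
    if len(raw) != bsize:
        raise ValueError("uncompressed size does not match BSize")
    if zlib.crc32(raw) & 0xFFFFFFFF != crc:
        raise ValueError("CRC does not match EACRC")
    return raw
```

The central-header version carries only BSize, so a reader can learn how much EA data the local block holds without decompressing anything.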
+
+
+ -OS/2 Access Control List Extra Field:
+ ====================================
+
+ The following is the layout of the OS/2 ACL extra block.
+ (Last Revision 19960922)
+
+ Local-header version:
+
+          Value         Size        Description
+          -----         ----        -----------
+  (ACL)   0x4c41        Short       tag for this extra block type ("AL")
+          TSize         Short       total data size for this block
+          BSize         Long        uncompressed ACL data size
+          CType         Short       compression type
+          EACRC         Long        CRC value for uncompressed ACL data
+          (var.)        variable    compressed ACL data
+
+ Central-header version:
+
+          Value         Size        Description
+          -----         ----        -----------
+  (ACL)   0x4c41        Short       tag for this extra block type ("AL")
+          TSize         Short       total data size for this block (4)
+          BSize         Long        size of uncompressed local ACL data
+
+ The value of CType is interpreted according to the "compression
+ method" section above; i.e., 0 for stored, 8 for deflated, etc.
+
+ The uncompressed ACL data consist of a text header of the form
+ "ACL1:%hX,%hd\n", where the first field is the OS/2 ACCINFO acc_attr
+ member and the second is acc_count, followed by acc_count strings
+ of the form "%s,%hx\n", where the first field is acl_ugname (user
+ group name) and the second acl_access. This block type will be
+ extended for other operating systems as needed.
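+
+ The text format of the uncompressed ACL data might be parsed as in the
+ sketch below. The function name is hypothetical, and decoding names as
+ ASCII is an assumption; the scanf-style formats are taken from the
+ description above.
+

```python
def parse_os2_acl_text(raw: bytes):
    """Parse uncompressed OS/2 ACL data into (acc_attr, entries)."""
    lines = raw.decode("ascii", errors="replace").split("\n")
    header = lines[0]
    if not header.startswith("ACL1:"):
        raise ValueError("missing ACL1 header")
    # Header is "ACL1:%hX,%hd": acc_attr in hex, acc_count in decimal.
    attr_hex, count_dec = header[5:].split(",")
    acc_attr = int(attr_hex, 16)
    acc_count = int(count_dec, 10)
    entries = []
    # Each following line is "%s,%hx": acl_ugname, then acl_access in hex.
    for line in lines[1:1 + acc_count]:
        name, access_hex = line.rsplit(",", 1)
        entries.append((name, int(access_hex, 16)))
    if len(entries) != acc_count:
        raise ValueError("fewer ACL entries than acc_count")
    return acc_attr, entries
```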
+
+
+ -Windows NT Security Descriptor Extra Field:
+ ==========================================
+
+ The following is the layout of the NT Security Descriptor (another
+ type of ACL) extra block. (Last Revision 19960922)
+
+ Local-header version:
+
+ Value Size Description
+ ----- ---- -----------
+ (SD) 0x4453 Short tag for this extra block type ("SD")
+ TSize Short total data size for this block
+ BSize Long uncompressed SD data size
+ Version Byte version of uncompressed SD data format
+ CType Short compression type
+ EACRC Long CRC value for uncompressed SD data
+ (var.) variable compressed SD data
+
+ Central-header version:
+
+ Value Size Description
+ ----- ---- -----------
+ (SD) 0x4453 Short tag for this extra block type ("SD")
+ TSize Short total data size for this block (4)
+ BSize Long size of uncompressed local SD data
+
+ The value of CType is interpreted according to the "compression
+ method" section above; i.e., 0 for stored, 8 for deflated, etc.
+ Version specifies how the compressed data are to be interpreted
+ and allows for future expansion of this extra field type. Currently
+ only version 0 is defined.
+
+ For version 0, the compressed data are to be interpreted as a single
+ valid Windows NT SECURITY_DESCRIPTOR data structure, in self-relative
+ format.
+
+
+ -PKWARE Win95/WinNT Extra Field:
+ ==============================
+
+ The following description covers PKWARE's "NTFS" attributes
+ "extra" block, introduced with the release of PKZIP 2.50 for
+ Windows. (Last Revision 20001118)
+
+ (Note: At this time the Mtime, Atime and Ctime values may
+ be used on any WIN32 system.)
+ [Info-ZIP note: In the current implementations, this field has
+ a fixed total data size of 32 bytes and is only stored as local
+ extra field.]
+
+ Value Size Description
+ ----- ---- -----------
+ (NTFS) 0x000a Short Tag for this "extra" block type
+ TSize Short Total Data Size for this block
+ Reserved Long for future use
+ Tag1 Short NTFS attribute tag value #1
+ Size1 Short Size of attribute #1, in bytes
+ (var.) Size1 Attribute #1 data
+ .
+ .
+ .
+ TagN Short NTFS attribute tag value #N
+ SizeN Short Size of attribute #N, in bytes
+ (var.) SizeN Attribute #N data
+
+ For NTFS, values for Tag1 through TagN are as follows:
+ (currently only one set of attributes is defined for NTFS)
+
+ Tag Size Description
+ ----- ---- -----------
+ 0x0001 2 bytes Tag for attribute #1
+ Size1 2 bytes Size of attribute #1, in bytes (24)
+ Mtime 8 bytes 64-bit NTFS file last modification time
+ Atime 8 bytes 64-bit NTFS file last access time
+ Ctime 8 bytes 64-bit NTFS file creation time
+
+ The total length of this attribute sub-block is 28 bytes (2 + 2 + 24).
+ Together with the 4-byte Reserved field, this results in a fixed size
+ value of 32 for the TSize field of the NTFS block.
+
+ The NTFS filetimes are 64-bit unsigned integers, stored in Intel
+ (least significant byte first) byte order. They give the number of
+ 100-nanosecond intervals (1.0E-07 seconds, i.e. 1/10th microseconds!)
+ elapsed since the WinNT "epoch", which is "01-Jan-1601 00:00:00 UTC".
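+
+ A minimal sketch of reading the three timestamps and converting them
+ to calendar time follows. The function names are hypothetical; the
+ input is assumed to be the 32 bytes of block data after the tag and
+ TSize fields, holding the single attribute (tag 0x0001) defined above.
+

```python
import struct
from datetime import datetime, timedelta, timezone

# The WinNT epoch named in the text: 01-Jan-1601 00:00:00 UTC.
NT_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(filetime: int) -> datetime:
    """Convert a count of 100 ns ticks since the NT epoch to a datetime.

    Precision below one microsecond is dropped, because datetime cannot
    represent it.
    """
    return NT_EPOCH + timedelta(microseconds=filetime // 10)


def parse_ntfs_times(payload: bytes):
    """Return (mtime, atime, ctime) from the data of a 0x000a block."""
    _reserved, tag, size = struct.unpack_from("<LHH", payload, 0)
    if tag != 0x0001 or size != 24:
        raise ValueError("unexpected NTFS attribute tag or size")
    ticks = struct.unpack_from("<QQQ", payload, 8)
    return tuple(filetime_to_datetime(t) for t in ticks)
```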
+
+
+ -PKWARE OpenVMS Extra Field:
+ ==========================
+
+ The following is the layout of PKWARE's OpenVMS attributes "extra"
+ block. (Last Revision 12/17/91)
+
+ Note: all fields stored in Intel low-byte/high-byte order.
+
+ Value Size Description
+ ----- ---- -----------
+ (VMS) 0x000c Short Tag for this "extra" block type
+ TSize Short Total Data Size for this block
+ CRC Long 32-bit CRC for remainder of the block
+ Tag1 Short OpenVMS attribute tag value #1
+ Size1 Short Size of attribute #1, in bytes
+ (var.) Size1 Attribute #1 data
+ .
+ .
+ .
+ TagN Short OpenVMS attribute tag value #N
+ SizeN Short Size of attribute #N, in bytes
+ (var.) SizeN Attribute #N data
+
+ Rules:
+
+ 1. There will be one or more attributes present, each of which
+ will be preceded by the above TagX & SizeX values.
+ These values are identical to the ATR$C_XXXX and
+ ATR$S_XXXX constants which are defined in ATR.H under
+ OpenVMS C. Neither of these values will ever be zero.
+
+ 2. No word alignment or padding is performed.
+
+ 3. A well-behaved PKZIP/OpenVMS program should never produce
+ more than one sub-block with the same TagX value. Also,
+ there will never be more than one "extra" block of type
+ 0x000c in a particular directory record.
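+
+ The Tag/Size sub-block structure and rule 2 (no padding) suggest a
+ simple walk over the data, sketched below. The function name is
+ hypothetical, and the CRC is assumed to cover every byte that follows
+ the CRC field itself.
+

```python
import struct
import zlib


def parse_pkware_vms(payload: bytes):
    """Return a list of (tag, data) pairs from the data of a 0x000c block.

    `payload` is the block data that follows the tag and TSize fields.
    """
    (crc,) = struct.unpack_from("<L", payload, 0)
    rest = payload[4:]
    # Assumption: the CRC covers all bytes after the CRC field.
    if zlib.crc32(rest) & 0xFFFFFFFF != crc:
        raise ValueError("CRC mismatch in OpenVMS extra block")
    attributes = []
    pos = 0
    while pos < len(rest):
        if pos + 4 > len(rest):
            raise ValueError("truncated sub-block header")
        tag, size = struct.unpack_from("<HH", rest, pos)
        pos += 4
        if pos + size > len(rest):
            raise ValueError("truncated sub-block data")
        # Sub-blocks follow one another directly, without padding.
        attributes.append((tag, rest[pos:pos + size]))
        pos += size
    return attributes
```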
+
+
+ -Info-ZIP VMS Extra Field:
+ ========================
+
+ The following is the layout of Info-ZIP's VMS attributes extra
+ block for VAX or Alpha AXP. The local-header and central-header
+ versions are identical. (Last Revision 19960922)
+
+ Value Size Description
+ ----- ---- -----------
+ (VMS2) 0x4d49 Short tag for this extra block type ("IM")
+ TSize Short total data size for this block
+ ID Long block ID
+ Flags Short info bytes
+ BSize Short uncompressed block size
+ Reserved Long (reserved)
+ (var.) variable compressed VMS file-attributes block
+
+ The block ID is one of the following unterminated strings:
+
+ "VFAB" struct FAB
+ "VALL" struct XABALL
+ "VFHC" struct XABFHC
+ "VDAT" struct XABDAT
+ "VRDT" struct XABRDT
+ "VPRO" struct XABPRO
+ "VKEY" struct XABKEY
+ "VMSV" version (e.g., "V6.1"; truncated at hyphen)
+ "VNAM" reserved
+
+ The lower three bits of Flags indicate the compression method. The
+ currently defined methods are:
+
+ 0 stored (not compressed)
+ 1 simple "RLE"
+ 2 deflated
+
+ The "RLE" method simply replaces zero-valued bytes with zero-valued
+ bits and non-zero-valued bytes with a "1" bit followed by the byte
+ value.
+
+ The variable-length compressed data contains only the data corre-
+ sponding to the indicated structure or string. Typically multiple
+ VMS2 extra fields are present (each with a unique block type).
+
+
+ -Info-ZIP Macintosh Extra Field:
+ ==============================
+
+ The following is the layout of the (old) Info-ZIP resource-fork extra
+ block for Macintosh. The local-header and central-header versions
+ are identical. (Last Revision 19960922)
+
+ Value Size Description
+ ----- ---- -----------
+ (Mac) 0x07c8 Short tag for this extra block type
+ TSize Short total data size for this block
+ "JLEE" beLong extra-field signature
+ FInfo 16 bytes Macintosh FInfo structure
+ CrDat beLong HParamBlockRec fileParam.ioFlCrDat
+ MdDat beLong HParamBlockRec fileParam.ioFlMdDat
+ Flags beLong info bits
+ DirID beLong HParamBlockRec fileParam.ioDirID
+ VolName 28 bytes volume name (optional)
+
+ All fields but the first two are in native Macintosh format
+ (big-endian Motorola order, not little-endian Intel). The least
+ significant bit of Flags is 1 if the file is a data fork, 0 other-
+ wise. In addition, if this extra field is present, the filename
+ has an extra 'd' or 'r' appended to indicate data fork or resource
+ fork. The 28-byte VolName field may be omitted.
+
+
+ -ZipIt Macintosh Extra Field (long):
+ ==================================
+
+ The following is the layout of the ZipIt extra block for Macintosh.
+ The local-header and central-header versions are identical.
+ (Last Revision 19970130)
+
+ Value Size Description
+ ----- ---- -----------
+ (Mac2) 0x2605 Short tag for this extra block type
+ TSize Short total data size for this block
+ "ZPIT" beLong extra-field signature
+ FnLen Byte length of FileName
+ FileName variable full Macintosh filename
+ FileType Byte[4] four-byte Mac file type string
+ Creator Byte[4] four-byte Mac creator string
+
+
+ -ZipIt Macintosh Extra Field (short):
+ ===================================
+
+ The following is the layout of a shortened variant of the
+ ZipIt extra block for Macintosh (without "full name" entry).
+ This variant is used by ZipIt 1.3.5 and newer for entries that
+ do not need a "full Mac filename" record.
+ The local-header and central-header versions are identical.
+ (Last Revision 19980903)
+
+ Value Size Description
+ ----- ---- -----------
+ (Mac2b) 0x2705 Short tag for this extra block type
+ TSize Short total data size for this block (12)
+ "ZPIT" beLong extra-field signature
+ FileType Byte[4] four-byte Mac file type string
+ Creator Byte[4] four-byte Mac creator string
+
+
+ -Info-ZIP Macintosh Extra Field (new):
+ ====================================
+
+ The following is the layout of the (new) Info-ZIP extra
+ block for Macintosh, designed by Dirk Haase.
+ All values are in little-endian byte order.
+ (Last Revision 19981005)
+
+ Local-header version:
+
+ Value Size Description
+ ----- ---- -----------
+ (Mac3) 0x334d Short tag for this extra block type ("M3")
+ TSize Short total data size for this block
+ BSize Long uncompressed finder attribute data size
+ Flags Short info bits
+ fdType Byte[4] Type of the File (4-byte string)
+ fdCreator Byte[4] Creator of the File (4-byte string)
+ (CType) Short compression type
+ (CRC) Long CRC value for uncompressed MacOS data
+ Attribs variable finder attribute data (see below)
+
+
+ Central-header version:
+
+ Value Size Description
+ ----- ---- -----------
+ (Mac3) 0x334d Short tag for this extra block type ("M3")
+ TSize Short total data size for this block
+ BSize Long uncompressed finder attribute data size
+ Flags Short info bits
+ fdType Byte[4] Type of the File (4-byte string)
+ fdCreator Byte[4] Creator of the File (4-byte string)
+
+ The third bit (bit 2) of Flags in both headers indicates whether
+ the LOCAL extra field is uncompressed (and therefore whether CType
+ and CRC are omitted):
+
+ Bits of the Flags:
+ bit 0 if set, file is a data fork; otherwise unset
+ bit 1 if set, filename will not be changed
+ bit 2 if set, Attribs is uncompressed (no CType, CRC)
+ bit 3 if set, dates and times are 64-bit;
+ if unset, dates and times are 32-bit
+ bit 4 if set, timezone offsets fields for the native
+ Mac times are omitted (UTC support deactivated)
+ bits 5-15 reserved;
+
+
+ Attributes:
+
+ Attribs is a Mac-specific block of data in little-endian format with
+ the following structure (if compressed, uncompress it first):
+
+ Value Size Description
+ ----- ---- -----------
+ fdFlags Short Finder Flags
+ fdLocation.v Short Finder Icon Location
+ fdLocation.h Short Finder Icon Location
+ fdFldr Short Folder containing file
+
+ FXInfo 16 bytes Macintosh FXInfo structure
+ FXInfo-Structure:
+ fdIconID Short
+ fdUnused[3] Short unused but reserved 6 bytes
+ fdScript Byte Script flag and number
+ fdXFlags Byte More flag bits
+ fdComment Short Comment ID
+ fdPutAway Long Home Dir ID
+
+ FVersNum Byte file version number
+ may not be used by MacOS
+ ACUser Byte directory access rights
+
+ FlCrDat ULong date and time of creation
+ FlMdDat ULong date and time of last modification
+ FlBkDat ULong date and time of last backup
+ These time numbers are original Mac FileTime values (local time!).
+ Currently, date-time width is 32 bits, but future versions may
+ support 64-bit times (see Flags).
+
+ CrGMTOffs Long(signed!) difference "local Creat. time - UTC"
+ MdGMTOffs Long(signed!) difference "local Modif. time - UTC"
+ BkGMTOffs Long(signed!) difference "local Backup time - UTC"
+ These "local time - UTC" differences (stored in seconds) may be
+ used to support timestamp adjustment after inter-timezone transfer.
+ These fields are optional; bit 4 of the flags word controls their
+ presence.
+
+ Charset Short TextEncodingBase (Charset)
+ valid for the following two fields
+
+ FullPath variable Path of the current file.
+ Zero terminated string (C-String)
+ Currently coded in the native Charset.
+
+ Comment variable Finder Comment of the current file.
+ Zero terminated string (C-String)
+ Currently coded in the native Charset.
+
+
+ -SmartZIP Macintosh Extra Field:
+ ==============================
+
+ The following is the layout of the SmartZIP extra
+ block for Macintosh, designed by Marco Bambini.
+
+ Local-header version:
+
+ Value Size Description
+ ----- ---- -----------
+ 0x4d63 Short tag for this extra block type ("cM")
+ TSize Short total data size for this block (64)
+ "dZip" beLong extra-field signature
+ fdType Byte[4] Type of the File (4-byte string)
+ fdCreator Byte[4] Creator of the File (4-byte string)
+ fdFlags beShort Finder Flags
+ fdLocation.v beShort Finder Icon Location
+ fdLocation.h beShort Finder Icon Location
+ fdFldr beShort Folder containing file
+ CrDat beLong HParamBlockRec fileParam.ioFlCrDat
+ MdDat beLong HParamBlockRec fileParam.ioFlMdDat
+ frScroll.v Byte vertical pos. of folder's scroll bar
+ fdScript Byte Script flag and number
+ frScroll.h Byte horizontal pos. of folder's scroll bar
+ fdXFlags Byte More flag bits
+ FileName Byte[32] full Macintosh filename (pascal string)
+
+ All fields but the first two are in native Macintosh format
+ (big-endian Motorola order, not little-endian Intel).
+ The extra field size is fixed to 64 bytes.
+ The local-header and central-header versions are identical.
+
+
+ -Acorn SparkFS Extra Field:
+ =========================
+
+ The following is the layout of David Pilling's SparkFS extra block
+ for Acorn RISC OS. The local-header and central-header versions are
+ identical. (Last Revision 19960922)
+
+ Value Size Description
+ ----- ---- -----------
+ (Acorn) 0x4341 Short tag for this extra block type ("AC")
+ TSize Short total data size for this block (20)
+ "ARC0" Long extra-field signature
+ LoadAddr Long load address or file type
+ ExecAddr Long exec address
+ Attr Long file permissions
+ Zero Long reserved; always zero
+
+ The following bits of Attr are associated with the given file
+ permissions:
+
+ bit 0 user-writable ('W')
+ bit 1 user-readable ('R')
+ bit 2 reserved
+ bit 3 locked ('L')
+ bit 4 publicly writable ('w')
+ bit 5 publicly readable ('r')
+ bit 6 reserved
+ bit 7 reserved
+
+
+ -VM/CMS Extra Field:
+ ==================
+
+ The following is the layout of the file-attributes extra block for
+ VM/CMS. The local-header and central-header versions are
+ identical. (Last Revision 19960922)
+
+ Value Size Description
+ ----- ---- -----------
+ (VM/CMS) 0x4704 Short tag for this extra block type
+ TSize Short total data size for this block
+ flData variable file attributes data
+
+ flData is an uncompressed fldata_t struct.
+
+
+ -MVS Extra Field:
+ ===============
+
+ The following is the layout of the file-attributes extra block for
+ MVS. The local-header and central-header versions are identical.
+ (Last Revision 19960922)
+
+ Value Size Description
+ ----- ---- -----------
+ (MVS) 0x470f Short tag for this extra block type
+ TSize Short total data size for this block
+ flData variable file attributes data
+
+ flData is an uncompressed fldata_t struct.
+
+
+ -PKWARE Unix Extra Field:
+ ========================
+
+ The following is the layout of PKWARE's Unix "extra" block.
+ It was introduced with the release of PKZIP for Unix 2.50.
+ Note: all fields are stored in Intel low-byte/high-byte order.
+ (Last Revision 19980901)
+
+ This field has a minimum data size of 12 bytes and is only stored
+ as local extra field.
+
+ Value Size Description
+ ----- ---- -----------
+ (Unix0) 0x000d Short Tag for this "extra" block type
+ TSize Short Total Data Size for this block
+ AcTime Long time of last access (UTC/GMT)
+ ModTime Long time of last modification (UTC/GMT)
+ UID Short Unix user ID
+ GID Short Unix group ID
+ (var) variable Variable length data field
+
+ The variable length data field will contain file type
+ specific data. Currently the only values allowed are
+ the original "linked to" file names for hard or symbolic
+ links, and the major and minor device node numbers for
+ character and block device nodes. Since device nodes
+ cannot be either symbolic or hard links, only one set of
+ variable length data is stored. Link files will have the
+ name of the original file stored. This name is NOT NULL
+ terminated. Its size can be determined by checking TSize -
+ 12. Device entries will have eight bytes stored as two 4
+ byte entries (in little-endian format). The first entry
+ will be the major device number, and the second the minor
+ device number.
+
+ [Info-ZIP note: The fixed part of this field has the same layout as
+ Info-ZIP's abandoned "Unix1 timestamps & owner ID info" extra field;
+ only the two tag bytes are different.]
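+
+ A sketch of splitting this block into its fixed and variable parts is
+ given below. The function names are hypothetical. Whether the variable
+ part is a link target or a pair of device numbers is not recorded in
+ the block itself, so the caller is assumed to know the file type from
+ elsewhere. The times are read as signed values here, which is also an
+ assumption.
+

```python
import struct


def parse_pkware_unix(payload: bytes):
    """Split the data of a 0x000d block into fixed fields and the rest.

    `payload` is the block data that follows the tag and TSize fields.
    """
    if len(payload) < 12:
        raise ValueError("PKWARE Unix block shorter than 12 bytes")
    atime, mtime, uid, gid = struct.unpack_from("<llHH", payload, 0)
    # The variable part has length TSize - 12 and is not NUL-terminated.
    return {"atime": atime, "mtime": mtime, "uid": uid, "gid": gid,
            "var": payload[12:]}


def device_numbers(var: bytes):
    """Decode (major, minor) from the variable part of a device entry."""
    if len(var) != 8:
        raise ValueError("device entry must be exactly eight bytes")
    return struct.unpack("<LL", var)
```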
+
+
+ -PATCH Descriptor Extra Field:
+ ============================
+
+ The following is the layout of the Patch Descriptor "extra"
+ block.
+
+ Note: all fields stored in Intel low-byte/high-byte order.
+
+ Value Size Description
+ ----- ---- -----------
+ (Patch) 0x000f Short Tag for this "extra" block type
+ TSize Short Size of the total "extra" block
+ Version Short Version of the descriptor
+ Flags Long Actions and reactions (see below)
+ OldSize Long Size of the file about to be patched
+ OldCRC Long 32-bit CRC of the file about to be patched
+ NewSize Long Size of the resulting file
+ NewCRC Long 32-bit CRC of the resulting file
+
+
+ Actions and reactions
+
+ Bits Description
+ ---- ----------------
+ 0 Use for autodetection
+ 1 Treat as selfpatch
+ 2-3 RESERVED
+ 4-5 Action (see below)
+ 6-7 RESERVED
+ 8-9 Reaction (see below) to absent file
+ 10-11 Reaction (see below) to newer file
+ 12-13 Reaction (see below) to unknown file
+ 14-15 RESERVED
+ 16-31 RESERVED
+
+ Actions
+
+ Action Value
+ ------ -----
+ none 0
+ add 1
+ delete 2
+ patch 3
+
+ Reactions
+
+ Reaction Value
+ -------- -----
+ ask 0
+ skip 1
+ ignore 2
+ fail 3
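+
+ The bit layout of the Flags field can be unpacked as in the following
+ sketch. The function and key names are hypothetical; the bit positions
+ and value tables are those listed above.
+

```python
ACTIONS = ("none", "add", "delete", "patch")
REACTIONS = ("ask", "skip", "ignore", "fail")


def decode_patch_flags(flags: int) -> dict:
    """Decode the 32-bit Flags field of the Patch Descriptor block."""
    return {
        "autodetect": bool(flags & 0x1),              # bit 0
        "selfpatch": bool(flags & 0x2),               # bit 1
        "action": ACTIONS[(flags >> 4) & 0x3],        # bits 4-5
        "if_absent": REACTIONS[(flags >> 8) & 0x3],   # bits 8-9
        "if_newer": REACTIONS[(flags >> 10) & 0x3],   # bits 10-11
        "if_unknown": REACTIONS[(flags >> 12) & 0x3], # bits 12-13
    }
```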
+
+
+ -PKCS#7 Store for X.509 Certificates:
+ ===================================
+
+ This field contains the information about each
+ certificate a file is signed with. This field should only
+ appear in the first central directory record, and will be
+ ignored in any other record.
+
+ Note: all fields stored in Intel low-byte/high-byte order.
+
+ Value Size Description
+ ----- ---- -----------
+ (Store) 0x0014 2 bytes Tag for this "extra" block type
+ SSize 2 bytes Size of the store data
+ SData (variable) Data about the store
+
+ SData
+ Value Size Description
+ ----- ---- -----------
+ Version 2 bytes Version number, 0x0001 for now
+ StoreD (variable) Actual store data
+
+ The StoreD member is suitable for passing as the pbData
+ member of a CRYPT_DATA_BLOB to the CertOpenStore() function
+ in Microsoft's CryptoAPI. The SSize member above will be
+ cbData + 6, where cbData is the cbData member of the same
+ CRYPT_DATA_BLOB. The encoding type to pass to
+ CertOpenStore() should be
+ PKCS_7_ASN_ENCODING | X509_ASN_ENCODING.
+
+
+ -X.509 Certificate ID and Signature for individual file:
+ ======================================================
+
+ This field contains the information about which certificate
+ in the PKCS#7 Store was used to sign the particular file.
+ It also contains the signature data. This field can appear
+ multiple times, but can only appear once per certificate.
+
+ Note: all fields stored in Intel low-byte/high-byte order.
+
+ Value Size Description
+ ----- ---- -----------
+ (CID) 0x0015 2 bytes Tag for this "extra" block type
+ CSize 2 bytes Size of Method
+ Method (variable)
+
+ Method
+ Value Size Description
+ ----- ---- -----------
+ Version 2 bytes Version number, for now 0x0001
+ AlgID 2 bytes Algorithm ID used for signing
+ IDSize 2 bytes Size of Certificate ID data
+ CertID (variable) Certificate ID data
+ SigSize 2 bytes Size of Signature data
+ Sig (variable) Signature data
+
+ CertID
+ Value Size Description
+ ----- ---- -----------
+ Size1 4 bytes Size of CertID, should be (IDSize - 4)
+ Size1 4 bytes A bug in version one causes this value
+ to appear twice.
+ IssSize 4 bytes Issuer data size
+ Issuer (variable) Issuer data
+ SerSize 4 bytes Serial Number size
+ Serial (variable) Serial Number data
+
+ The Issuer and IssSize members are suitable for creating a
+ CRYPT_DATA_BLOB to be the Issuer member of a CERT_INFO
+ struct. The Serial and SerSize members would be the
+ SerialNumber member of the same CERT_INFO struct. This
+ struct would be used to find the certificate in the store
+ the file was signed with. Those structures are from the MS
+ CryptoAPI.
+
+ Sig and SigSize are the actual signature data and size
+ generated by signing the file with the MS CryptoAPI using a
+ hash created with the given AlgID.
+
+
+ -X.509 Certificate ID and Signature for central directory:
+ ========================================================
+
+ This field contains the information about which certificate
+ in the PKCS#7 Store was used to sign the central directory.
+ It should only appear with the first central directory
+ record, along with the store. The data structure is the
+ same as the CID, except that SigSize will be 0, and there
+ will be no Sig member.
+
+ This field is also kept after the last central directory
+ record, as the signature data (ID 0x05054b50, it looks like
+ a central directory record of a different type). This
+ second copy of the data is the Signature Data member of the
+ record, and will have a SigSize that is non-zero, and will
+ have Sig data.
+
+ Note: all fields stored in Intel low-byte/high-byte order.
+
+ Value Size Description
+ ----- ---- -----------
+ (CDID) 0x0016 2 bytes Tag for this "extra" block type
+ CSize 2 bytes Size of Method
+ Method (variable)
+
+
+ -ZIP64 Extended Information Extra Field:
+ ======================================
+
+ The following is the layout of the ZIP64 extended
+ information "extra" block. If one of the size or
+ offset fields in the Local or Central directory
+ record is too small to hold the required data,
+ a ZIP64 extended information record is created.
+ The order of the fields in the ZIP64 extended
+ information record is fixed, but the fields will
+ only appear if the corresponding Local or Central
+ directory record field is set to 0xFFFF or 0xFFFFFFFF.
+
+ Note: all fields stored in Intel low-byte/high-byte order.
+
+ Value Size Description
+ ----- ---- -----------
+ (ZIP64) 0x0001 2 bytes Tag for this "extra" block type
+ Size 2 bytes Size of this "extra" block
+ Original
+ Size 8 bytes Original uncompressed file size
+ Compressed
+ Size 8 bytes Size of compressed data
+ Relative Header
+ Offset 8 bytes Offset of local header record
+ Disk Start
+ Number 4 bytes Number of the disk on which
+ this file starts
+
+ This entry in the Local header must include BOTH original
+ and compressed file sizes.
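+
+ Because a field appears only when its counterpart in the Local or
+ Central directory record is saturated, a reader has to consult those
+ values while walking the block. A minimal sketch, with a hypothetical
+ function name and with the header values passed in by the caller:
+

```python
import struct


def parse_zip64_extra(payload, usize, csize, offset=None, disk=None):
    """Resolve saturated header fields from the data of a 0x0001 block.

    `payload` is the block data after the tag and Size fields. The other
    arguments are the values found in the enclosing header; pass None
    for `offset` and `disk` when reading a local header.
    """
    pos = 0

    def take(fmt):
        nonlocal pos
        (value,) = struct.unpack_from(fmt, payload, pos)
        pos += struct.calcsize(fmt)
        return value

    # The order is fixed; each field is present only if the header field
    # holds its all-ones marker value.
    if usize == 0xFFFFFFFF:
        usize = take("<Q")
    if csize == 0xFFFFFFFF:
        csize = take("<Q")
    if offset == 0xFFFFFFFF:
        offset = take("<Q")
    if disk == 0xFFFF:
        disk = take("<L")
    return usize, csize, offset, disk
```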
+
+
+ -Extended Timestamp Extra Field:
+ ==============================
+
+ The following is the layout of the extended-timestamp extra block.
+ (Last Revision 19970118)
+
+ Local-header version:
+
+ Value Size Description
+ ----- ---- -----------
+ (time) 0x5455 Short tag for this extra block type ("UT")
+ TSize Short total data size for this block
+ Flags Byte info bits
+ (ModTime) Long time of last modification (UTC/GMT)
+ (AcTime) Long time of last access (UTC/GMT)
+ (CrTime) Long time of original creation (UTC/GMT)
+
+ Central-header version:
+
+ Value Size Description
+ ----- ---- -----------
+ (time) 0x5455 Short tag for this extra block type ("UT")
+ TSize Short total data size for this block
+ Flags Byte info bits (refers to local header!)
+ (ModTime) Long time of last modification (UTC/GMT)
+
+ The central-header extra field contains the modification time only,
+ or no timestamp at all. TSize is used to flag its presence or
+ absence. But note:
+
+ If "Flags" indicates that ModTime is present in the local header
+ field, it MUST be present in the central header field, too!
+ This correspondence is required because the modification time
+ value may be used to support trans-timezone freshening and
+ updating operations with zip archives.
+
+ The time values are in standard Unix signed-long format, indicating
+ the number of seconds since 1 January 1970 00:00:00. The times
+ are relative to Coordinated Universal Time (UTC), also sometimes
+ referred to as Greenwich Mean Time (GMT). To convert to local time,
+ the software must know the local timezone offset from UTC/GMT.
+
+ The lower three bits of Flags in both headers indicate which time-
+ stamps are present in the LOCAL extra field:
+
+ bit 0 if set, modification time is present
+ bit 1 if set, access time is present
+ bit 2 if set, creation time is present
+ bits 3-7 reserved for additional timestamps; not set
+
+ Those times that are present will appear in the order indicated, but
+ any combination of times may be omitted. (Creation time may be
+ present without access time, for example.) TSize should equal
+ (1 + 4*(number of set bits in Flags)), as the block is currently
+ defined. Other timestamps may be added in the future.
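+
+ The rules above can be sketched as follows. The function name is
+ hypothetical, and the input is assumed to be the block data after the
+ tag and TSize fields, so that its length equals TSize.
+

```python
import struct


def parse_extended_timestamp(payload: bytes, central: bool = False):
    """Return (flags, times) from the data of a 0x5455 ("UT") block."""
    flags = payload[0]
    times = {}
    if central:
        # Flags describe the LOCAL field. Only ModTime can follow here,
        # and the data length (TSize) tells whether it actually does.
        if flags & 0x1 and len(payload) >= 5:
            (times["mtime"],) = struct.unpack_from("<l", payload, 1)
        return flags, times
    pos = 1
    # Present times appear in this order; any of them may be omitted.
    for bit, name in enumerate(("mtime", "atime", "ctime")):
        if flags & (1 << bit):
            (value,) = struct.unpack_from("<l", payload, pos)
            times[name] = value
            pos += 4
    return flags, times
```

+
+ The values returned are seconds since 1 January 1970 00:00:00 UTC,
+ read as signed 32-bit integers.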
+
+
+ -Info-ZIP Unix Extra Field (type 1):
+ ==================================
+
+ The following is the layout of the old Info-ZIP extra block for
+ Unix. It has been replaced by the extended-timestamp extra block
+ (0x5455) and the Unix type 2 extra block (0x7855).
+ (Last Revision 19970118)
+
+ Local-header version:
+
+ Value Size Description
+ ----- ---- -----------
+ (Unix1) 0x5855 Short tag for this extra block type ("UX")
+ TSize Short total data size for this block
+ AcTime Long time of last access (UTC/GMT)
+ ModTime Long time of last modification (UTC/GMT)
+ UID Short Unix user ID (optional)
+ GID Short Unix group ID (optional)
+
+ Central-header version:
+
+ Value Size Description
+ ----- ---- -----------
+ (Unix1) 0x5855 Short tag for this extra block type ("UX")
+ TSize Short total data size for this block
+ AcTime Long time of last access (GMT/UTC)
+ ModTime Long time of last modification (GMT/UTC)
+
+ The file access and modification times are in standard Unix signed-
+ long format, indicating the number of seconds since 1 January 1970
+ 00:00:00. The times are relative to Coordinated Universal Time
+ (UTC), also sometimes referred to as Greenwich Mean Time (GMT). To
+ convert to local time, the software must know the local timezone
+ offset from UTC/GMT. The modification time may be used by non-Unix
+ systems to support inter-timezone freshening and updating of zip
+ archives.
+
+ The local-header extra block may optionally contain UID and GID
+ info for the file. The local-header TSize value is the only
+ indication of this. Note that Unix UIDs and GIDs are usually
+ specific to a particular machine, and they generally require root
+ access to restore.
+
+ This extra field type is obsolete, but it has been in use since
+ mid-1994. Therefore future archiving software should continue to
+ support it. Some guidelines:
+
+ An archive member should either contain the old "Unix1"
+ extra field block or the new extra field types "time" and/or
+ "Unix2".
+
+ If both the old "Unix1" block type and one or both of the new
+ block types "time" and "Unix2" are found, the "Unix1" block
+ should be considered invalid and ignored.
+
+ Unarchiving software should recognize both old and new extra
+ field block types, but the info from new types overrides the
+ old "Unix1" field.
+
+ Archiving software should recognize "Unix1" extra fields for
+ timestamp comparison but never create it for updated, freshened
+ or new archive members. When copying existing members to a new
+ archive, any "Unix1" extra field blocks should be converted to
+ the new "time" and/or "Unix2" types.
+
+
+ -Info-ZIP Unix Extra Field (type 2):
+ ==================================
+
+ The following is the layout of the new Info-ZIP extra block for
+ Unix. (Last Revision 19960922)
+
+ Local-header version:
+
+ Value Size Description
+ ----- ---- -----------
+ (Unix2) 0x7855 Short tag for this extra block type ("Ux")
+ TSize Short total data size for this block (4)
+ UID Short Unix user ID
+ GID Short Unix group ID
+
+ Central-header version:
+
+ Value Size Description
+ ----- ---- -----------
+ (Unix2) 0x7855 Short tag for this extra block type ("Ux")
+ TSize Short total data size for this block (0)
+
+ The data size of the central-header version is zero; it is used
+ solely as a flag that UID/GID info is present in the local-header
+ extra field. If additional fields are ever added to the local
+ version, the central version may be extended to indicate this.
+
+ Note that Unix UIDs and GIDs are usually specific to a particular
+ machine, and they generally require root access to restore.
+
+
+ -ASi Unix Extra Field:
+ ====================
+
+ The following is the layout of the ASi extra block for Unix. The
+ local-header and central-header versions are identical.
+ (Last Revision 19960916)
+
+ Value Size Description
+ ----- ---- -----------
+ (Unix3) 0x756e Short tag for this extra block type ("nu")
+ TSize Short total data size for this block
+ CRC Long CRC-32 of the remaining data
+ Mode Short file permissions
+ SizDev Long symlink'd size OR major/minor dev num
+ UID Short user ID
+ GID Short group ID
+ (var.) variable symbolic link filename
+
+ Mode is the standard Unix st_mode field from struct stat, containing
+ user/group/other permissions, setuid/setgid and symlink info, etc.
+
+ If Mode indicates that this file is a symbolic link, SizDev is the
+ size of the file to which the link points. Otherwise, if the file
+ is a device, SizDev contains the standard Unix st_rdev field from
+ struct stat (includes the major and minor numbers of the device).
+ SizDev is undefined in other cases.
+
+ If Mode indicates that the file is a symbolic link, the final field
+ will be the name of the file to which the link points. The file-
+ name length can be inferred from TSize.
+
+ [Note that TSize may incorrectly refer to the data size not counting
+ the CRC; i.e., it may be four bytes too small.]
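+
+ A sketch of decoding this block follows. The function name is
+ hypothetical. Because TSize may be four bytes too small, the caller is
+ assumed to pass the bytes actually belonging to the block rather than
+ trusting TSize blindly.
+

```python
import stat
import struct
import zlib


def parse_asi_unix(payload: bytes) -> dict:
    """Decode the data of a 0x756e ("nu") block.

    `payload` is the block data that follows the tag and TSize fields.
    """
    (crc,) = struct.unpack_from("<L", payload, 0)
    rest = payload[4:]
    if zlib.crc32(rest) & 0xFFFFFFFF != crc:
        raise ValueError("CRC mismatch in ASi Unix block")
    mode, sizdev, uid, gid = struct.unpack_from("<HLHH", rest, 0)
    # The link target is present only for symbolic links; its length is
    # whatever remains after the 10 bytes of fixed fields.
    link = rest[10:] if stat.S_ISLNK(mode) else None
    return {"mode": mode, "sizdev": sizdev, "uid": uid, "gid": gid,
            "link": link}
```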
+
+
+ -BeOS Extra Field:
+ ================
+
+ The following is the layout of the file-attributes extra block for
+ BeOS. (Last Revision 19970531)
+
+ Local-header version:
+
+ Value Size Description
+ ----- ---- -----------
+ (BeOS) 0x6542 Short tag for this extra block type ("Be")
+ TSize Short total data size for this block
+ BSize Long uncompressed file attribute data size
+ Flags Byte info bits
+ (CType) Short compression type
+ (CRC) Long CRC value for uncompressed file attribs
+ Attribs variable file attribute data
+
+ Central-header version:
+
+ Value Size Description
+ ----- ---- -----------
+ (BeOS) 0x6542 Short tag for this extra block type ("Be")
+ TSize Short total data size for this block (5)
+ BSize Long size of uncompr. local EF block data
+ Flags Byte info bits
+
+ The least significant bit of Flags in both headers indicates whether
+ the LOCAL extra field is uncompressed (and therefore whether CType
+ and CRC are omitted):
+
+ bit 0 if set, Attribs is uncompressed (no CType, CRC)
+ bits 1-7 reserved; if set, assume error or unknown data
+
+ Currently the only supported compression types are deflated (type 8)
+ and stored (type 0); the latter is not used by Info-ZIP's Zip but is
+ supported by UnZip.
+
+ Attribs is a BeOS-specific block of data in big-endian format with
+ the following structure (if compressed, uncompress it first):
+
+ Value Size Description
+ ----- ---- -----------
+ Name variable attribute name (null-terminated string)
+ Type Long attribute type (32-bit unsigned integer)
+ Size Long Long data size for this sub-block (64 bits)
+ Data variable attribute data
+
+ The attribute structure is repeated for every attribute. The Data
+ field may contain anything--text, flags, bitmaps, etc.
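+
+ Once Attribs has been uncompressed, the repeated attribute structure
+ could be walked as in this sketch. The function name is hypothetical,
+ and decoding attribute names as UTF-8 is an assumption.
+

```python
import struct


def parse_beos_attribs(raw: bytes):
    """Return a list of (name, type, data) from uncompressed Attribs."""
    attributes = []
    pos = 0
    while pos < len(raw):
        end = raw.index(b"\0", pos)            # Name is NUL-terminated
        name = raw[pos:end].decode("utf-8", errors="replace")
        # Type (32 bits) and Size (64 bits) are big-endian.
        attr_type, size = struct.unpack_from(">LQ", raw, end + 1)
        start = end + 1 + 12
        if start + size > len(raw):
            raise ValueError("truncated attribute data")
        attributes.append((name, attr_type, raw[start:start + size]))
        pos = start + size
    return attributes
```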
+
+
+ -SMS/QDOS Extra Field:
+ ====================
+
+ The following is the layout of the file-attributes extra block for
+ SMS/QDOS. The local-header and central-header versions are identical.
+ (Last Revision 19960929)
+
+ Value Size Description
+ ----- ---- -----------
+ (QDOS) 0xfb4a Short tag for this extra block type
+ TSize Short total data size for this block
+ LongID Long extra-field signature
+ (ExtraID) Long additional signature/flag bytes
+ QDirect 64 bytes qdirect structure
+
+ LongID may be "QZHD" or "QDOS". In the latter case, ExtraID will
+ be present. Its first three bytes are "02\0"; the last byte is
+ currently undefined.
+
+ QDirect contains the file's uncompressed directory info (qdirect
+ struct). Its elements are in native (big-endian) format:
+
+ d_length beLong file length
+ d_access byte file access type
+ d_type byte file type
+ d_datalen beLong data length
+ d_reserved beLong unused
+ d_szname beShort size of filename
+ d_name 36 bytes filename
+ d_update beLong time of last update
+ d_refdate beLong file version number
+ d_backup beLong time of last backup (archive date)
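The element sizes listed above add up to the 64 bytes given for QDirect (4+1+1+4+4+2+36+4+4+4). A minimal Python sketch of decoding the structure follows; the names QDirect, QDIRECT and parse_qdirect are chosen for illustration, and trimming d_name to d_szname bytes is an assumption about how the name is meant to be read.

```python
import struct
from collections import namedtuple

QDirect = namedtuple(
    "QDirect",
    "d_length d_access d_type d_datalen d_reserved "
    "d_szname d_name d_update d_refdate d_backup",
)

# Big-endian, no padding: beLong, byte, byte, beLong, beLong, beShort,
# 36-byte name, then three beLongs.
QDIRECT = struct.Struct(">IBBIIH36sIII")


def parse_qdirect(raw: bytes) -> QDirect:
    """Decode the 64-byte qdirect structure from a QDOS extra block."""
    if len(raw) != QDIRECT.size:
        raise ValueError("qdirect must be exactly 64 bytes")
    q = QDirect._make(QDIRECT.unpack(raw))
    # Assumption: only the first d_szname bytes of d_name are significant.
    return q._replace(d_name=q.d_name[:q.d_szname])
```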
+
+
+ -AOS/VS Extra Field:
+ ==================
+
+ The following is the layout of the extra block for Data General
+ AOS/VS. The local-header and central-header versions are identical.
+ (Last Revision 19961125)
+
+ Value Size Description
+ ----- ---- -----------
+ (AOSVS) 0x5356 Short tag for this extra block type ("VS")
+ TSize Short total data size for this block
+ "FCI\0" Long extra-field signature
+ Version Byte version of AOS/VS extra block (10 = 1.0)
+ Fstat variable fstat packet
+ AclBuf variable raw ACL data ($MXACL bytes)
+
+ Fstat contains the file's uncompressed fstat packet, which is one of
+ the following:
+
+ normal fstat packet (P_FSTAT struct)
+ DIR/CPD fstat packet (P_FSTAT_DIR struct)
+ unit (device) fstat packet (P_FSTAT_UNIT struct)
+ IPC file fstat packet (P_FSTAT_IPC struct)
+
+ AclBuf contains the raw ACL data; its length is $MXACL.
+
+
+ -Tandem NSK Extra Field:
+ ======================
+
+ The following is the layout of the file-attributes extra block for
+ Tandem NSK. The local-header and central-header versions are
+ identical. (Last Revision 19981221)
+
+ Value Size Description
+ ----- ---- -----------
+ (TA) 0x4154 Short tag for this extra block type ("TA")
+ TSize Short total data size for this block (20)
+ NSKattrs 20 Bytes NSK attributes
+
+
+ -THEOS Extra Field:
+ =================
+
+ The following is the layout of the file-attributes extra block for
+ Theos. The local-header and central-header versions are identical.
+ (Last Revision 19990206)
+
+ Value Size Description
+ ----- ---- -----------
+ (Theos) 0x6854 Short 'Th' signature
+ size Short size of extra block
+ flags Byte reserved for future use
+ filesize Long file size
+ fileorg Byte type of file (see below)
+ keylen Short key length for indexed and keyed files,
+ data segment size for 16 bits programs
+ reclen Short record length for indexed,keyed and direct,
+ text segment size for 16 bits programs
+ filegrow Byte growing factor for indexed,keyed and direct
+ protect Byte protections (see below)
+ reserved Short reserved for future use
+
+ File types
+ ==========
+
+ 0x80 library (keyed access list of files)
+ 0x40 directory
+ 0x10 stream file
+ 0x08 direct file
+ 0x04 keyed file
+ 0x02 indexed file
+ 0x0e reserved
+ 0x01 16 bits real mode program (obsolete)
+ 0x21 16 bits protected mode program
+ 0x41 32 bits protected mode program
+
+ Protection codes
+ ================
+
+ User protection
+ ---------------
+ 0x01 non readable
+ 0x02 non writable
+ 0x04 non executable
+ 0x08 non erasable
+
+ Other protection
+ ----------------
+ 0x10 non readable
+ 0x20 non writable
+ 0x40 non executable Theos before 4.0
+ 0x40 modified Theos 4.x
+ 0x80 not hidden
+
+
+ -THEOS old unofficial Extra Field:
+ ================================
+
+ The following is the layout of an unofficial former version of the
+ Theos file-attributes extra block. This layout was never published
+ and is no longer created. However, UnZip can optionally support it
+ when compiled with the option flag OLD_THEOS_EXTRA defined.
+ The local-header and central-header versions are identical.
+ (Last Revision 19990206)
+
+ Value Size Description
+ ----- ---- -----------
+ (THS0) 0x4854 Short 'TH' signature
+ size Short size of extra block
+ flags Short reserved for future use
+ filesize Long file size
+ reclen Short record length for indexed,keyed and direct,
+ text segment size for 16 bits programs
+ keylen Short key length for indexed and keyed files,
+ data segment size for 16 bits programs
+ filegrow Byte growing factor for indexed,keyed and direct
+ reserved 3 Bytes reserved for future use
+
+
+ -FWKCS MD5 Extra Field:
+ =====================
+
+ The FWKCS Contents_Signature System, used in automatically
+ identifying files independent of filename, optionally adds
+ and uses an extra field to support the rapid creation of
+ an enhanced contents_signature.
+ There is no local-header version; the following applies
+ only to the central header. (Last Revision 19961207)
+
+ Central-header version:
+
+ Value Size Description
+ ----- ---- -----------
+ (MD5) 0x4b46 Short tag for this extra block type ("FK")
+ TSize Short total data size for this block (19)
+ "MD5" 3 bytes extra-field signature
+ MD5hash 16 bytes 128-bit MD5 hash of uncompressed data
+ (low byte first)
+
+ When FWKCS revises a .ZIP file central directory to add
+ this extra field for a file, it also replaces the
+ central directory entry for that file's uncompressed
+ file length with a measured value.
+
+ FWKCS provides an option to strip this extra field, if
+ present, from a .ZIP file central directory. In adding
+ this extra field, FWKCS preserves .ZIP file Authenticity
+ Verification; if stripping this extra field, FWKCS
+ preserves all versions of AV through PKZIP version 2.04g.
+
+ FWKCS, and FWKCS Contents_Signature System, are
+ trademarks of Frederick W. Kantor.
+
+ (1) R. Rivest, RFC1321.TXT, MIT Laboratory for Computer
+ Science and RSA Data Security, Inc., April 1992.
+ ll.76-77: "The MD5 algorithm is being placed in the
+ public domain for review and possible adoption as a
+ standard."
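As an illustration of the central-header layout above, the field can be assembled as follows. This is a sketch, not FWKCS code: it assumes the tag and TSize are stored little-endian like other ZIP header values, and that "low byte first" corresponds to the usual MD5 digest byte order defined in RFC 1321, which is what hashlib returns. The function name is hypothetical.

```python
import hashlib
import struct


def build_fwkcs_md5_field(uncompressed: bytes) -> bytes:
    """Return a complete 0x4b46 extra block for the given file contents."""
    digest = hashlib.md5(uncompressed).digest()   # 16 bytes
    payload = b"MD5" + digest                     # signature + hash = 19 bytes
    # Tag 0x4b46 serializes as the bytes "FK"; TSize counts only the payload.
    return struct.pack("<HH", 0x4B46, len(payload)) + payload
```

The result is 23 bytes long: 4 bytes of tag and TSize followed by the 19 bytes of data that TSize describes.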
+
+
+ -Info-ZIP Unicode Path Extra Field:
+ =================================
+
+ Stores the UTF-8 version of the entry path as stored in the
+ local header and central directory header.
+ (Last Revision 20070912)
+
+ Value Size Description
+ ----- ---- -----------
+ (UPath) 0x7075 Short tag for this extra block type ("up")
+ TSize Short total data size for this block
+ Version 1 byte version of this extra field, currently 1
+ NameCRC32 4 bytes File Name Field CRC32 Checksum
+ UnicodeName Variable UTF-8 version of the entry File Name
+
+ Currently Version is set to the number 1. If there is a need
+ to change this field, the version will be incremented. Changes
+ may not be backward compatible so this extra field should not be
+ used if the version is not recognized.
+
+ The NameCRC32 is the standard zip CRC32 checksum of the File Name
+ field in the header. This is used to verify that the header
+ File Name field has not changed since the Unicode Path extra field
+ was created. This can happen if a utility renames the entry but
+ does not update the UTF-8 path extra field. If the CRC check fails,
+ this UTF-8 Path Extra Field should be ignored and the File Name field
+ in the header used instead.
+
+ The UnicodeName is the UTF-8 version of the contents of the File Name
+ field in the header. As UnicodeName is defined to be UTF-8, no UTF-8
+ byte order mark (BOM) is used. The length of this field is determined
+ by subtracting the size of the previous fields from TSize. If both
+ the File Name and Comment fields are UTF-8, the new General Purpose
+ Bit Flag, bit 11 (Language encoding flag (EFS)), can be used to
+ indicate that both the header File Name and Comment fields are UTF-8
+ and, in this case, the Unicode Path and Unicode Comment extra fields
+ are not needed and should not be created. Note that, for backward
+ compatibility, bit 11 should only be used if the native character set
+ of the paths and comments being zipped up is already UTF-8. The
+ same method, either bit 11 or extra fields, should be used in both
+ the local and central directory headers.
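The version and CRC checks described above might be applied as in the following Python sketch. It takes the data portion of the extra block (the TSize bytes after the tag and TSize) and the raw File Name bytes from the same header. Little-endian storage of NameCRC32 is assumed, as for other ZIP header values; the function name is illustrative.

```python
import struct
import zlib


def unicode_path_from_extra(block: bytes, header_name: bytes):
    """Return the UTF-8 path, or None if the field must be ignored."""
    if len(block) < 5:
        return None
    version, name_crc = struct.unpack_from("<BI", block, 0)
    if version != 1:
        return None        # unrecognized version: do not use the field
    if name_crc != zlib.crc32(header_name) & 0xFFFFFFFF:
        return None        # header name changed since the field was written
    # UnicodeName occupies the rest of the block (TSize minus 5 bytes).
    return block[5:].decode("utf-8")
```

A caller that receives None falls back to the File Name field in the header, as the text above requires.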
+
+
+ -Info-ZIP Unicode Comment Extra Field:
+ ====================================
+
+ Stores the UTF-8 version of the entry comment as stored in the
+ central directory header.
+ (Last Revision 20070912)
+
+ Value Size Description
+ ----- ---- -----------
+ (UCom) 0x6375 Short tag for this extra block type ("uc")
+ TSize Short total data size for this block
+ Version 1 byte version of this extra field, currently 1
+ ComCRC32 4 bytes Comment Field CRC32 Checksum
+ UnicodeCom Variable UTF-8 version of the entry comment
+
+ Currently Version is set to the number 1. If there is a need
+ to change this field, the version will be incremented. Changes
+ may not be backward compatible so this extra field should not be
+ used if the version is not recognized.
+
+ The ComCRC32 is the standard zip CRC32 checksum of the Comment
+ field in the central directory header. This is used to verify that
+ the comment field has not changed since the Unicode Comment extra field
+ was created. This can happen if a utility changes the Comment field
+ but does not update the UTF-8 Comment extra field. If the CRC check
+ fails, this Unicode Comment extra field should be ignored and the
+ Comment field in the header used.
+
+ The UnicodeCom field is the UTF-8 version of the entry comment field
+ in the header. As UnicodeCom is defined to be UTF-8, no UTF-8 byte
+ order mark (BOM) is used. The length of this field is determined by
+ subtracting the size of the previous fields from TSize. If both the
+ File Name and Comment fields are UTF-8, the new General Purpose Bit
+ Flag, bit 11 (Language encoding flag (EFS)), can be used to indicate
+ both the header File Name and Comment fields are UTF-8 and, in this
+ case, the Unicode Path and Unicode Comment extra fields are not
+ needed and should not be created. Note that, for backward
+ compatibility, bit 11 should only be used if the native character set
+ of the paths and comments being zipped up is already UTF-8. The
+ same method, either bit 11 or extra fields, should be used in both
+ the local and central directory headers.
+
+
+ -Info-ZIP New Unix Extra Field:
+ =============================
+
+ Currently stores Unix UIDs/GIDs up to 32 bits.
+ (Last Revision 20080509)
+
+ Value Size Description
+ ----- ---- -----------
+ (UnixN) 0x7875 Short tag for this extra block type ("ux")
+ TSize Short total data size for this block
+ Version 1 byte version of this extra field, currently 1
+ UIDSize 1 byte Size of UID field
+ UID Variable UID for this entry
+ GIDSize 1 byte Size of GID field
+ GID Variable GID for this entry
+
+ Currently Version is set to the number 1. If there is a need
+ to change this field, the version will be incremented. Changes
+ may not be backward compatible so this extra field should not be
+ used if the version is not recognized.
+
+ UIDSize is the size of the UID field in bytes. This size should
+ match the size of the UID field on the target OS.
+
+ UID is the UID for this entry in standard little endian format.
+
+ GIDSize is the size of the GID field in bytes. This size should
+ match the size of the GID field on the target OS.
+
+ GID is the GID for this entry in standard little endian format.
+
+ If both the old 16-bit Unix extra field (tag 0x7855, Info-ZIP Unix)
+ and this extra field are present, the values in this extra field
+ supersede the values in that extra field.
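Because UIDSize and GIDSize are stored in the field itself, a reader has to walk the block rather than use a fixed layout. A minimal Python sketch follows, taking the data portion of the block (after the tag and TSize); the function name is illustrative and returning None for an unusable field is a choice made here, not something the specification prescribes.

```python
def parse_new_unix_extra(block: bytes):
    """Return (uid, gid) from a 0x7875 data block, or None if unusable."""
    if len(block) < 3 or block[0] != 1:
        return None                      # too short, or unrecognized Version
    uid_size = block[1]
    uid_end = 2 + uid_size
    if len(block) < uid_end + 1:
        return None                      # no room for UID plus GIDSize
    uid = int.from_bytes(block[2:uid_end], "little")
    gid_size = block[uid_end]
    gid_start = uid_end + 1
    if len(block) < gid_start + gid_size:
        return None                      # GID truncated
    gid = int.from_bytes(block[gid_start:gid_start + gid_size], "little")
    return uid, gid
```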
diff --git a/proginfo/fileinfo.cms b/proginfo/fileinfo.cms
new file mode 100644
index 0000000..9d21935
--- /dev/null
+++ b/proginfo/fileinfo.cms
@@ -0,0 +1,231 @@
+[Quoting from a C/370 manual, courtesy of Carl Forde.]
+
+ C/370 supports three types of input and output: text streams, binary
+ streams, and record I/O. Text and binary streams are both ANSI
+ standards; record I/O is a C/370 extension.
+
+[...]
+
+ Record I/O is a C/370 extension to the ANSI standard. For files
+ opened in record format, C/370 reads and writes one record at a
+ time. If you try to write more data to a record than the record
+ can hold, the data is truncated. For record I/O, C/370 only allows
+ the use of fread() and fwrite() to read and write to the files. Any
+ other functions (such as fprintf(), fscanf(), getc(), and putc())
+ fail. For record-orientated files, records do not change size when
+ you update them. If the new data has fewer characters than the
+ original record, the new data fills the first n characters, where
+ n is the number of characters of the new data. The record will
+ remain the same size, and the old characters (those after n) are
+ left unchanged. A subsequent update begins at the next boundary.
+ For example, if you have the string "abcdefgh":
+
+ abcdefgh
+
+ and you overwrite it with the string "1234", the record will look
+ like this:
+
+ 1234efgh
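The update rule in this example can be modelled in a few lines. This Python sketch only imitates the behaviour described in the quoted manual text; it is not C/370 code, and update_record is a name invented here.

```python
def update_record(record: bytes, new_data: bytes) -> bytes:
    """Overwrite the start of a record; the record never changes size."""
    n = min(len(new_data), len(record))   # excess new data is truncated
    return new_data[:n] + record[n:]      # old bytes after n are left unchanged
```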
+
+ C/370 record I/O is binary. That is, it does not interpret any of
+ the data in a record file and therefore does not recognize control
+ characters.
+
+
+ The record model consists of:
+
+ * A record, which is the unit of data transmitted to and from a
+ program
+ * A block, which is the unit of data transmitted to and from a
+ device. Each block may contain one or more records.
+
+ In the record model of I/O, records and blocks have the following
+ attributes:
+
+ RECFM Specifies the format of the data or how the data is organized
+ on the physical device.
+ LRECL Specifies the length of logical records (as opposed to
+ physical ones).
+
+ BLKSIZE Specifies the length of physical records (blocks on the
+ physical device).
+
+
+ Opening a File by Filename
+
+ The filename that you specify on the call to fopen() or freopen()
+ must be in the following format:
+
+ >> ----filename---- ----filetype--------------------
+ | | | |
+ --.-- -- --filemode--
+ | |
+ --.--
+ where
+
+ filename is a 1- to 8-character string of any of the characters,
+ A-Z, a-z, 0-9, and +, -, $, #, @, :, and _. You can separate it
+ from the filetype with one or more spaces, or with a period.
+ [Further note: filenames are fully case-sensitive, as in Unix.]
+
+ filetype is a 1- to 8-character string of any of the characters,
+ A-Z, a-z, 0-9, and +, -, $, #, @, :, and _. You can separate it
+ from the filemode with one or more spaces, or with a period. The
+ separator between filetype and filemode must be the same as the
+ one between filename and filetype.
+
+ filemode is a 1- to 2-character string. The first must be any of
+ the characters A-Z, a-z, or *. If you use the asis parameter on
+ the fopen() or freopen() call, the first character of the filemode
+ must be a capital letter or an asterisk. Otherwise, the function
+ call fails. The second character of filemode is optional; if you
+ specify it, it must be any of the digits 0-6. You cannot specify
+ the second character if you have specified * for the first one.
+
+ If you do not use periods as separators, there is no limit to how
+ much whitespace you can have before and after the filename, the
+ filetype, and filemode.
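The naming rules above can be approximated with two regular expressions, one for the period-separated form and one for the space-separated form. This Python sketch is an interpretation of the quoted text, not a validated parser: it assumes the two forms are never mixed and that surrounding whitespace is accepted only in the space-separated form. The function name is hypothetical.

```python
import re

# 1 to 8 characters from A-Z, a-z, 0-9 and + - $ # @ : _
_PART = r"[A-Za-z0-9+\-$#@:_]{1,8}"
# A letter with an optional digit 0-6, or a lone asterisk.
_MODE = r"(?:[A-Za-z][0-6]?|\*)"

_DOTTED = re.compile(rf"({_PART})\.({_PART})(?:\.({_MODE}))?")
_SPACED = re.compile(rf"\s*({_PART})\s+({_PART})(?:\s+({_MODE}))?\s*")


def parse_cms_fileid(text: str):
    """Return (filename, filetype, filemode or None), or None if invalid."""
    m = _DOTTED.fullmatch(text) or _SPACED.fullmatch(text)
    return m.groups() if m else None
```

The additional restriction that the asis parameter requires an upper-case filemode letter is not modelled here.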
+
+
+ Opening a File without a File Mode Specified
+
+ If you omit the file mode or specify * for it, C/370 does one
+ of the following when you call fopen() or freopen():
+
+ * If you have specified a read mode, C/370 looks for the named file
+ on all the accessed readable disks, in order. If it does not find
+ the file, the fopen() or freopen() call fails.
+ * If you have specified any of the write modes, C/370 writes the file
+ on the first writable disk you have accessed. Specifying a write
+ mode on an fopen() or freopen() call that contains the filename of
+ an existing file destroys that file. If you do not have any
+ writable disks accessed, the call fails.
+
+
+ fopen() and freopen() parameters
+
+ recfm
+ CMS supports only two RECFMs, V and F. [note that MVS supports
+ 27(!) different RECFMs.] If you do not specify the RECFM for a
+ file, C/370 determines whether it is in fixed or variable format.
+
+ lrecl and blksize
+ For files in fixed format, CMS allows records to be read and
+ written in blocks. To have a fixed format CMS file treated as a
+ fixed blocked CMS file, you can open the file with recfm=fb and
+ specify the lrecl and blksize. If you do not specify a recfm on
+ the open, the blksize can be a multiple of the lrecl, and the
+ file is treated as if it were blocked.
+
+ For files in variable format, the CMS LRECL is different from the
+ LRECL for the record model. In the record model, the LRECL is
+ equal to the data length plus 4 bytes (for the record descriptor
+ word), and the BLKSIZE is equal to the LRECL plus 4 bytes (for
+ the block descriptor word). In CMS, BDWs and RDWs do not exist,
+ but because CMS follows the record model, you must still account
+ for them. When you specify V, you must still allocate the record
+ descriptor word and block descriptor word. That is, if you want
+ a maximum of n bytes per record, you must specify a minimum LRECL
+ of n+4 and a minimum BLKSIZE of n+8.
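The arithmetic for variable-format files is small enough to state as a helper. The Python sketch below just restates the rule from the paragraph above; the names are illustrative.

```python
RDW_BYTES = 4   # record descriptor word
BDW_BYTES = 4   # block descriptor word


def min_variable_sizes(max_data_bytes: int):
    """Return (minimum LRECL, minimum BLKSIZE) for RECFM V records."""
    lrecl = max_data_bytes + RDW_BYTES
    blksize = lrecl + BDW_BYTES
    return lrecl, blksize
```

For records of at most 80 data bytes this gives an LRECL of 84 and a BLKSIZE of 88.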
+
+ When you are appending to V files, you can enlarge the record size
+ dynamically, but only if you have not specified LRECL or BLKSIZE
+ on the fopen() or freopen() command that opened the file.
+
+ type
+ If you specify this parameter, the only valid value for CMS disk
+ files is type=record. This opens a file for record I/O.
+
+ asis
+ If you use this parameter, you can open files with mixed-case
+ filenames such as JaMeS dAtA or pErCy.FILE. If you specify this
+ parameter, the file mode that you specify must be a capital letter
+ (if it is not an asterisk); otherwise, the function call fails and
+ the value returned is NULL.
+
+
+ Reading from Record I/O Files
+ fread() is the only interface allowed for reading record I/O files.
+ Each time you call fread() for a record I/O file, fread() reads
+ one record from the system. If you call fread() with a request for
+ less than a complete record, the requested bytes are copied to your
+ buffer, and the file position is set to the start of the next
+ record. If the request is for more bytes than are in the record,
+ one record is read and the position is set to the start of the next
+ record. C/370 does not strip any blank characters or interpret any
+ data.
+
+ fread() returns the number of items read successfully, so if you
+ pass a size argument equal to 1 and a count argument equal to the
+ maximum expected length of the record, fread() returns the length,
+ in bytes, of the record read. If you pass a size argument equal
+ to the maximum expected length of the record, and a count argument
+ equal to 1, fread() returns either 0 or 1, indicating whether a
+ record of length size was read. If a record is read successfully but
+ is less than size bytes long, fread() returns 0.
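The two calling conventions described above give different return values for the same record. The following Python sketch models only the return value and the bytes copied for a single record, as described in the quoted text; it is not the C/370 implementation, and record_fread is a name invented for the illustration.

```python
def record_fread(record: bytes, size: int, count: int):
    """Model fread(buf, size, count, fp) on one record of a record I/O file.

    Returns (items_read, bytes_copied). At most one record is consumed.
    """
    if size <= 0 or count <= 0:
        return 0, b""
    copied = record[:size * count]   # never reads past the end of the record
    items = len(copied) // size      # only complete items are counted
    return items, copied
```

With an 80-byte record, a call with size 1 and count 132 reports 80 items, while a call with size 132 and count 1 reports 0 even though the 80 bytes were read.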
+
+
+ Writing to Record I/O Files
+ fwrite() is the only interface allowed for writing to a file
+ opened for record I/O. Only one record is written at a time. If
+ you attempt to write more new data than a full record can hold or
+ try to update a record with more data than it currently has, C/370
+ truncates your output at the record boundary. When C/370 performs
+ a truncation, it sets errno and raises SIGIOERR, if SIGIOERR is not
+ set to SIG_IGN.
+
+ When you are writing new records to a fixed-record I/O file, if you
+ try to write a short record, C/370 pads the record with nulls out
+ to LRECL.
+
+ At the completion of an fwrite(), the file position is at the start
+ of the next record. For new data, the block is flushed out to the
+ system as soon as it is full.
+
+
+ fldata() Behavior
+ When you call the fldata() function for an open CMS minidisk file,
+ it returns a data structure that looks like this:
+
+ struct __filedata {
+ unsigned int __recfmF : 1, /* fixed length records */
+ __recfmV : 1, /* variable length records */
+ __recfmU : 1, /* n/a */
+ __recfmS : 1, /* n/a */
+ __recfmBlk : 1, /* n/a */
+ __recfmASA : 1, /* text mode and ASA */
+ __recfmM : 1, /* n/a */
+ __dsorgPO : 1, /* n/a */
+ __dsorgPDSmem : 1, /* n/a */
+ __dsorgPDSdir : 1, /* n/a */
+ __dsorgPS : 1, /* sequential data set */
+ __dsorgConcat : 1, /* n/a */
+ __dsorgMem : 1, /* n/a */
+ __dsorgHiper : 1, /* n/a */
+ __dsorgTemp : 1, /* created with tmpfile() */
+ __dsorgVSAM : 1, /* n/a */
+ __reserve1 : 1, /* n/a */
+ __openmode : 2, /* see below 1 */
+ __modeflag : 4, /* see below 2 */
+ __reserve2 : 9; /* n/a */
+
+ char __device; /* __DISK */
+ unsigned long __blksize, /* see below 3 */
+ __maxreclen; /* see below 4 */
+ unsigned short __vsamtype; /* n/a */
+ unsigned long __vsamkeylen; /* n/a */
+ unsigned long __vsamRKP; /* n/a */
+ char * __dsname; /* fname ftype fmode */
+ unsigned int __reserve4; /* n/a */
+
+ /* note 1: values are: __TEXT, __BINARY, __RECORD
+ note 2: values are: __READ, __WRITE, __APPEND, __UPDATE
+ these values can be added together to determine
+ the return value; for example, a file opened with
+ a+ will have the value __READ + __APPEND.
+ note 3: total block size of the file, including ASA
+ characters as well as RDW information
+ note 4: maximum record length of the data only (includes
+ ASA characters but excludes RDW information).
+ */
+ };
diff --git a/proginfo/infozip.who b/proginfo/infozip.who
new file mode 100644
index 0000000..994851c
--- /dev/null
+++ b/proginfo/infozip.who
@@ -0,0 +1,242 @@
+These members of the Info-ZIP group contributed to the development and
+testing of portable Zip. They are responsible for whatever works in
+Zip. Whatever doesn't work is solely the fault of the authors of Zip
+(Mark Adler, Rich Wales, Jean-loup Gailly, Kai Uwe Rommel, Igor Mandrichenko,
+Onno van der Linden, Christian Spieler, John Bush, Paul Kienitz, Sergio Monesi
+and Karl Davis, but see the license for the latest list). If you have
+contributed and your name has been forgotten, please send a reminder to us
+using the contact information in the Readme file. The names are given here
+in alphabetical order, because it's impossible to classify them by importance
+of the contribution. Some have made a complete port to a new target, some
+have provided a one line fix. All are to be thanked.
+
+
+Mark Adler madler@tybalt.caltech.edu NeXT 2.x, Mac
+ alan@spri.levels.unisa.edu.au Linux
+Jeffrey Altman jaltman@watsun.cc.columbia.edu fseek bug on NT
+Glenn J. Andrews oper1%drcv06.decnet@drcvax.af.mil VAX VMS
+James Van Artsdalen james@raid.dell.com bug report
+Eric Backus ericb@lsid.hp.com bug report
+Quentin Barnes qbarnes@urbana.css.mot.com unix/Makefile mode of
+ installed files
+Elmar Bartel bartel@informatik.tu-muenchen.de
+Mark E. Becker mbecker@cs.uml.edu bug report
+Paul von Behren Paul_von_Behren@stortek.com OS/390 port
+Jon Bell swy@wsdot.wa.gov Intergraph/CLIX
+Myles Bennett - Initial UnZip 6.0 large
+ files beta
+Michael Bernardi mike@childsoc.demon.co.uk RS6000
+Tom Betz marob!upaya!tbetz@phri.nyu.edu SCO Xenix 2.3.1
+James Birdsall jwbirdsa@picarefy.com AT&T 3B1
+George boer@fwi.uva.nl OS/2
+Michael Bolton bolton@vaxc.erim.org VAX/VMS
+Wim Bonner 27313853@WSUVM1.CSC.WSU.EDU HP 9000/840a HPUX
+Paul Borman prb@cray.com Cray-X/YMP,2 UNICOS 6-8
+Kurt Van den Branden kvd2@bipsy.se.bel.alcatel.be VAX VMS
+Scott Briggs briggs@nashua.progress.com Windows NT
+Leslie C. Brown lbrown@BRL.MIL Pyramid MIS-4
+Ralf Brown ralf@b.gp.cs.cmu.edu Pyramid MIS-4
+Rodney Brown rdb@cmutual.com.au SunOS 4.1.3 DGUX OSF/1
+ HP-UX CRC optimization
+Jeremy Daniel Buhler jbuhler@owlnet.rice.edu BC++
+John Bush john.bush@east.sun.com Amiga (SAS/C)
+Pietro Caselli zaphod@petruz.sublink.org Minix 1.5.10
+Andrew A. Chernov ache@astral.msk.su FreeBSD
+Jeff Coffler jeffcof@microsoft.com Windows NT
+David Dachtera David.Dachtera@advocatehealth.com VMS
+ link_zip.com bug
+Bill Davidsen davidsen@crdos1.crd.ge.com Xenix (on what?)
+Karl Davis riscman@geko.com.au Acorn
+Daniel Deimert daniel@pkmab.se zeus3.21 Zilog S8000
+David Denholm denholm@sotona.physics.southampton.ac.uk VMS
+Harald Denker harry@hal.westfalen.de ATARI
+Matthew J. D'Errico doc@magna.com Bull
+L. Peter Deutsch ghost@aladdin.com Linux
+Uwe Doering gemini@geminix.in-berlin.de 386 Unix
+Jean-Michel Dubois jmdubois@ibcfrance.fr Theos support
+James P. Dugal jpd@usl.edu Pyramid 90X OSx4.1
+"Evil Ed" esaffle@gmuvax2.gmu.edu Ultrix-32 V3.1 (Rev. 9)
+Patrick Ellis pellis@aic.mdc.com VMS zip -h appearance
+Thomas Esken esken@uni-muenster.de Acorn fix
+Dwight Estep estep@dlo10.enet.dec.com MSDOS
+David A. Feinleib t-davefe@microsoft.com Windows NT
+Joshua Felsteiner joshua@phys1.technion.ac.il Linux
+Greg Flint afc@klaatu.cc.purdue.edu ETA-10P* hybrid Sys V
+Carl Forde cforde@bcsc02.gov.bc.ca VM/CMS
+Jeff Foy jfoy@glia.biostr.washington.edu IRIX Sys V Rel 3.3.1
+Mike Freeman mikef@pacifier.com Vax VMS
+Kevin M. Fritz kmfritz@apgea.army.mil Turbo C++ 1.0
+ Pyramid
+Jean-loup Gailly jloup@chorus.fr MS-DOS Microsoft C 5.1
+Scott D. Galloway sgallowa@letterkenn-emh1.army.mil Sperry 5000 SysV.3
+Rainer Gerling gerling@faupt101.physik.uni-erlangen.de HPUX, MSDOS
+Henry Gessau henryg@kullmar.kullmar.se Windows NT
+Ed Gordon - Zip 3.0, VB, Unicode,
+ large files, splits, DLLs
+Ian E. Gorman ian@iosphere.net ported zip 2.2 to VM/CMS
+Wayne R. Graves graves@csa2.lbl.gov Vax VMS
+George Grimes grimes@netcom.com Apollo Domain SR10.4
+Hunter Goatley goathunter@MadGoat.com VMS (VAX & Alpha),
+ web and ftp sites
+Arnt Gulbrandsen agulbra@pvv.unit.no Linux
+David Gundlach david@rolf.stat.uga.edu Sun SS1+ SunOS 4.1
+Peter Gutmann pgut1@cs.aukuni.ac.nz bug report
+Dirk Haase d_haase@sitec.de MacOS port
+Mark Hanning-Lee markhl@iris-355.jpl.nasa.gov SGI
+Walter Haidinger e9225662@student.tuwien.ac.at Amiga and general fixes
+Charles Hannum mycroft@ai.mit.edu bug report
+Greg Hartwig ghartwig@ix.netcom.com VM/CMS cleanup
+Tanvir Hassan tanvir.hassan@autodesk.com NT
+Bob Hardy hardy@lucid.com Power C on MSDOS
+Zachary Heilig heilig@plains.nodak.edu Turbo C++ 3.0
+Chris Herborth chrish@pobox.com BeOS port
+Jonathan Hudson jrhudson@bigfoot.com QDOS port
+Mark William Jacobs mark@mensch.stanford.edu MSDOS
+Aubrey Jaffer jaffer@martigny.ai.mit.edu Pixel
+Peter Jones jones.peter@uqam.ca MIPS UMIPS 4.0
+ +Onolimit fix for HP-UX
+Kjetil W. J{\o}rgensen jorgens@lise.unit.no OSF/1, DJGPP v2
+Bruce Kahn bkahn@archive.webo.dg.com MS-DOS Microsoft C 5.1
+Jonathan I. Kamens jik@pit-manager.mit.edu ultrix on DECstation
+Dave Kapalko d.kapalko@att.com bug report
+Bob Kemp Robert.V.Kemp@att.com AT&T 3B2 SysV 3.2v2
+Vivek Khera khera@cs.duke.edu SunOS
+Earl Kiech KIECH@utkvx.utk.edu VAX VMS V5.4-1A
+Paul Kienitz Paul.Kienitz@shelter.sf.ca.us Amiga, Watcom C
+David Kirschbaum kirsch@usasoc.soc.mil He got us all in this
+ mess in the first place
+Thomas Klausner wiz@danbala.tuwien.ac.at cygwin32 and -k fix
+D. Krumbholz krumbh00@marvin.informatik.uni-dortmund.de
+ Acorn filetype and
+ timestamp bug report
+
+Bo Kullmar bk@kullmar.se DNIX 5.3, SunOS 4.1
+Baden Kudrenecky baden@unixg.ubc.ca OS/2
+Giuseppe La Sala lasala@mail.esa.esrin.it VMS
+Jean-Marc Lasgouttes jean-marc.lasgouttes@inria.fr Bug report
+Harry Langenbacher harry@neuron6.Jpl.Nasa.Gov Sun SS1+ SunOS 4.1
+Michael D. Lawler mdlawler@gwmicro.com Mt.Xinu BSD 4.3 on VAX
+ Borland C++ 4.51
+Johnny Lee johnnyl@microsoft.com Microsoft C 7.0
+Michael Lemke michael@io.as.utexas.edu VMS
+David Lemson lemson@ux1.cso.uiuc.edu Sequent Dynix 3.0.17
+Tai-Shan Lin tlin@snakeyes.eecs.wsu.edu OS/2
+Onno van der Linden onno@simplex.nl NetBSD, Borland C++,
+ MSC 7.0, DJGPP 2
+
+Michel loehden%mv13.decnet@vax.hrz.uni-marburg.de VMS
+Warner Losh imp@Solbourne.COM packing algorithm help
+Dave Lovelace davel@grex.cyberspace.org DG AOS/VS
+Erik Luijten erik@tntnhb3.tn.tudelft.nl problem report
+John Lundin lundin@urvax.urich.edu VAX VMS
+Igor Mandrichenko mandrichenko@m10.ihep.su VAX VMS
+Cliff Manis root@csoftec.csf.com SCO 2.3.1 (386)
+Fulvio Marino fulvio@iconet.ico.olivetti.it X/OS 2.3 & 2.4
+Bill Marsh bmarsh@cod.nosc.mil SGI Iris 4D35
+Michael Mauch mauch@gmx.de djgpp LFN attribute fix
+Peter Mauzey ptm@mtdcr.mt.lucent.com AT&T 6300, 7300
+Rafal Z. Maszkowski rzm@mat.torun.edu.pl Convex
+Robert McBroom (?) rm3@ornl.gov DECsystem 5810
+Tom McConnell tmcconne@sedona.intel.com NCR SVR4
+Frank P. McIngvale frankm@eng.auburn.edu Bug report
+Conor McMenamin C.S.McMenamin@sussex.ac.uk MSDOS
+Will Menninger Win32, MinGW
+John Messenger jlm@proteon.com Bug report
+Michael kuch@mailserv.zdv.uni-tuebingen.de SGI
+Dan Mick dmick@pongo.west.sun.com Solaris
+Alan Modra alan@spri.levels.unisa.edu.au Linux
+Laszlo Molnar lmolnar@goliat.eik.bme.hu DJGPP v2
+Jim Mollmann jmq@nccibm1.bitnet OS/2 & MVS
+Sergio Monesi pel0015@cdc8g5.cdc.polimi.it Acorn
+J. Mukherjee jmukherj@ringer.cs.utsa.edu OS/2
+Anthony Naggs amn@ubik.demon.co.uk bug report
+Matti Narkia matti.narkia@ntc.nokia.com VAX VMS
+Rainer Nausedat Zip 3.0, large files
+Robert E. Newman Jr. newmanr@ssl.msfc.nasa.gov bug report
+Robert Nielsen NielsenRJ@ems.com 2.2 -V VMS bug report
+Christian Michel cmichel@de.ibm.com 2.2 check_dup OS/2 bug
+ report
+Thomas S. Opheys opheys@kirk.fmi.uni-passau.de OS/2
+Humberto Ortiz-Zuazaga zuazaga@ucunix.san.uc.edu Linux
+James E. O'Dell jim@fpr.com MacOS
+William O'Shaughnessy williamo@hpcupt1.cup.hp.com HPUX
+Neil Parks neil.parks@pcohio.com MSDOS
+Enrico Renato Palmerini palmer@vxscaq.cineca.it UNISYS 7000 Sys 5 r2.3
+Geoff Pennington Geoff.Pennington@sgcs.co.uk -q output bug
+Keith Petersen w8sdz@simtel20.army.mil Pyramid UCB OSx4.4c
+George Petrov VM/CMS, MVS
+Alan Phillips postmaster@lancaster.ac.uk Dynix/ptx 1.3
+Bruno Pillard bp@chorus.fr SunOS 4.1
+Piet W. Plomp piet@icce.rug.nl MSC 7.0, SCO 3.2v4.0
+John Poltorak j.poltorak@bradford.ac.uk problem report
+Kenneth Porter 72420.2436@compuserve.com OS/2
+Norbert Pueschel pueschel@imsdd.meb.uni-bonn.de Amiga time.lib
+Yuval Rakavy yuval@cs.huji.ac.il MSDOS
+David A Rasmussen dave@convex.csd.uwm.edu Convex C220 with 9.0 OS
+Eric Raymond esr@snark.thyrsus.com Unix
+Jim Read 74312.3103@compuserve.com OS/2
+Michael Regoli mr@cica.indiana.edu Ultrix 3.1 VAX 8650
+ BSD 4.3 IBM RT/125
+ BSD 4.3 MicroVAX 3500
+ SunOS 4.0.3 Sun 4/330
+Jochen Roderburg roderburg@rrz.uni-koeln.de Digital Unix with
+ AFS/NFS converter
+Rick Rodgers rodgers@maxwell.mmwb.ucsf.EDU Unix man page
+Greg Roelofs roe2@midway.uchicago.edu SunOS 4.1.1,4.1.2 Sun 4
+ Unicos 5.1--6.1.5 Cray
+ OS/2 1.3 MS C 6.0
+ Ultrix 4.1,4.2 DEC 5810
+ VMS 5.2, 5.4 VAX 8600
+ Irix 3.3.2, SGI Iris 4D
+ UTS 1.2.4 Amdahl 5880
+Phil Ritzenthaler phil@cgrg.ohio-state.edu SYSV
+Kai Uwe Rommel rommel@ars.de or rommel@leo.org OS/2
+Markus Ruppel m.ruppel@imperial.ac.uk OS/2
+Shimazaki Ryo eririn@ma.mailbank.ne.jp human68k
+Jon Saxton jrs@panix.com Microsoft C 6.0
+Steve Salisbury stevesa@msn.com Microsoft C 8.0
+Timo Salmi ts@uwasa.fi bug report
+Darren Salt ds@youmustbejoking.demon.co.uk RISC OS
+NIIMI Satoshi a01309@cfi.waseda.ac.jp Human68K
+Tom Schmidt tschmidt@micron.com SCO 286
+Martin Schulz martin.schulz@isltd.insignia.com Windows NT, Atari
+Steven Schweda VMS, Unix, large files
+Dan Seyb dseyb@halnet.com AIX
+Mark Shadley shadcat@catcher.com unix fixes
+Timur Shaporev tim@rd.relcom.msk.su MSDOS
+W. T. Sidney sidney@picard.med.ge.com bug report
+Dave Sisson daves@vtcosy.cns.vt.edu AIX 1.1.1 PS/2 & 3090
+Dave Smith smithdt@bp.com Tandem port
+Fred Smith fredex@fcshome.stoneham.ma.us Coherent
+Christian Spieler spieler@ikp.tu-darmstadt.de VMS, MSDOS, emx, djgpp,
+ WIN32, Linux
+Ron Srodawa srodawa@vela.acs.oakland.edu SCO Xenix/386 2.3.3
+Adam Stanley astanley@winternet.com MSDOS
+Bertil Stenstr|m stenis@heron.dafa.se HP-UX 7.0 HP9000/835
+Carl Streeter streeter@oshkoshw.bitnet OS/2
+Reuben Sumner rasumner@undergrad.math.uwaterloo.ca Suggestions
+E-Yen Tan e-yen.tan@brasenose.oxford.ac.uk Borland C++ win32
+Yoshioka Tsuneo tsuneo-y@is.aist-nara.ac.jp Multibyte charset
+ support
+Paul Telles paul@pubnet.com SCO Xenix
+Julian Thompson jrt@oasis.icl.co.uk bug report
+Christopher C. Tjon tjon@plains.nodak.edu bug report
+Robert F Tobler rft@cs.stanford.edu bug report
+Eric Tomio tomio@acri.fr bug report
+Cosmin Truta cosmint@cs.ubbcluj.ro win32 gcc based + asm
+Anthony R. Venson cevens@unix1.sncc.lsu.edu MSDOS/emx
+Antoine Verheijen antoine@sysmail.ucs.ualberta.ca envargs fix
+Arjan de Vet devet@info.win.tue.nl SunOS 4.1, MSC 5.1
+Santiago Vila Doncel sanvila@ba.unex.es MSDOS
+Johan Vromans jv@mh.nl bug report
+Rich Wales wales@cs.ucla.edu SunOS 4.0.3 Sun-3/50
+Scott Walton scottw@io.com BSD/386
+Frank J. Wancho wancho@wsmr-simtel20.army.mil TOPS-20
+ oyvind@stavanger.sgp.slb.com Bug report.
+Takahiro Watanabe wata@first.tsukuba.ac.jp fixes for INSTALL
+Mike White mwhite@pumatech.com wizzip DLL
+Ray Wickert wickert@dc-srv.pa-x.dec.com MSDOS/DJGPP
+Winfried Winkler willi@wap0109.chem.tu-berlin.de AIX
+Norman J. Wong as219@freenet.carleton.ca MSDOS
+Martin Zinser m.zinser@gsi.de VMS 7.x
+
diff --git a/proginfo/ntsd.txt b/proginfo/ntsd.txt
new file mode 100644
index 0000000..8ac31ba
--- /dev/null
+++ b/proginfo/ntsd.txt
@@ -0,0 +1,111 @@
+Info-ZIP portable Zip/UnZip Windows NT security descriptor support
+==================================================================
+Scott Field (sfield@microsoft.com), 8 October 1996
+
+
+This version of Info-ZIP's Win32 code allows for processing of Windows
+NT security descriptors if they were saved in the .zip file using the
+appropriate Win32 Zip running under Windows NT. This also requires
+that the file system that Zip/UnZip operates on supports persistent
+Acl storage. When the operating system is not Windows NT or the
+target file system does not support persistent Acl storage, no security
+descriptor processing takes place.
+
+A Windows NT security descriptor consists of any combination of the
+following components:
+
+ an owner (Sid)
+ a primary group (Sid)
+ a discretionary ACL (Dacl)
+ a system ACL (Sacl)
+ qualifiers for the preceding items
+
+By default, Zip will save all aspects of the security descriptor except
+for the Sacl. The Sacl contains information pertaining to auditing of
+the file, and requires a security privilege be granted to the calling
+user in addition to being enabled by the calling application. In order
+to save the Sacl during Zip, the user must specify the -! switch on the
+Zip commandline. The user must also be granted either the SeBackupPrivilege
+"Backup files and directories" or the SeSystemSecurityPrivilege "Manage
+auditing and security log".
+
+By default, UnZip will not restore any aspects of the security descriptor.
+If the -X option is specified to UnZip, the Dacl is restored to the file.
+The other items in the security descriptor on the new file will receive
+default values. If the -XX option is specified to UnZip, as many aspects
+of the security descriptor as possible will be restored. If the calling
+user is granted the SeRestorePrivilege "Restore files and directories",
+all aspects of the security descriptor will be restored. If the calling
+user is only granted the SeSystemSecurityPrivilege "Manage auditing and
+security log", only the Dacl and Sacl will be restored to the new file.
+
+Note that when operating on files that reside on remote volumes, the
+privileges specified above must be granted to the calling user on that
+remote machine. Currently, there is no way to directly test what privileges
+are present on a remote machine, so Zip and UnZip make a remote privilege
+determination based on an indirect method.
+
+UnZip considerations
+--------------------
+
+In order for file security to be processed correctly, any directory entries
+that have a security descriptor will be processed at the end of the unzip
+cycle. This allows for unzip to process files within the newly created
+directory regardless of the security descriptor associated with the directory
+entry. This also prevents security inheritance problems that can occur as
+a result of creating a new directory and then creating files in that directory
+that will inherit parent directory permissions; such inherited permissions may
+prevent the security descriptor taken from the zip file from being applied
+to the new file.
+
+If directories exist which match directory/extract paths in the .zip file,
+file security is not updated on the target directory. It is assumed that if
+the target directory already exists, then appropriate security has already
+been applied to that directory.
+
+"unzip -t" will test the integrity of stored security descriptors when
+present and the operating system is Windows NT.
+
+ZipInfo (unzip -Z) will display information on stored security descriptors
+when "unzip -Zv" is specified.
+
+
+Potential uses
+==============
+
+The obvious use for this new support is to better support backup and restore
+operations in a Windows NT environment where NTFS file security is utilized.
+This allows individuals and organizations to archive files in a portable
+fashion and transport these files across the organization.
+
+Another potential use of this support is setup and installation. This
+allows for distribution of Windows NT based applications that have preset
+security on files and directories. For example, prior to creation of the
+.zip file, the user can set file security via File Manager or Explorer on
+the files to be contained in the .zip file. In many cases, it is appropriate
+to only grant Everyone Read access to .exe and .dll files, while granting
+Administrators Full control. Using this support in conjunction with the
+unzipsfx.exe self-extractor stub can yield a useful and powerful way to
+install software with preset security (note that -X or -XX should be
+specified on the self-extractor commandline).
+
+When creating .zip files with security which are intended for transport
+across systems, it is important to take into account the relevance of
+access control entries and the associated Sid of each entry. For example,
+if a .zip file is created on a Windows NT workstation, and file security
+references local workstation user accounts (like an account named Fred),
+this access entry will not be relevant if the .zip file is transported to
+another machine. Where possible, take advantage of the built-in well-known
+groups, like Administrators, Everyone, Network, Guests, etc. These groups
+have the same meaning on any Windows NT machine. Note that the names of
+these groups may differ depending on the language of the installed Windows
+NT, but this isn't a problem since each name has a well-known ID that, upon
+restore, translates to the correct group name regardless of locale.
+
+When access control entries contain Sid entries that reference Domain
+accounts, these entries will only be relevant on systems that recognize
+the referenced domain. Generally speaking, the only side effects of
+irrelevant access control entries are wasted space in the stored security
+descriptor and loss of complete intended access control. Such irrelevant
+access control entries will show up as "Account Unknown" when viewing file
+security with File Manager or Explorer.
diff --git a/proginfo/perform.dos b/proginfo/perform.dos
new file mode 100644
index 0000000..98744ee
--- /dev/null
+++ b/proginfo/perform.dos
@@ -0,0 +1,183 @@
+Date: Wed, 27 Mar 1996 01:31:50 CET +0100
+From: Christian Spieler (IKDA, THD, D-64289 Darmstadt)
+Subject: More detailed comparison of MSDOS Info-ZIP programs' performance
+
+Hello all,
+
+In response to some additional questions and requests concerning
+my previous message about DOS performance of 16/32-bit Info-ZIP programs,
+I have produced a more detailed comparison:
+
+System:
+Cx486DX-40, VL-bus, 8MB; IDE hard disk;
+DOS 6.2, HIMEM, EMM386 NOEMS NOVCPI, SMARTDRV 3MB, write back.
+
+I have used the main directory of UnZip 5.20p as source, including the
+objects and executable of an EMX compile for unzip.exe (to supply some
+binary test files).
+
+Tested programs were (my current updated sources!) Zip 2.0w and UnZip 5.20p
+- 16-bit MSC 5.1, compressed with LZEXE 0.91e
+- 32-bit Watcom C 10.5, as supplied by Kai Uwe Rommel (PMODE 1.22)
+- 32-bit EMX 0.9b
+- 32-bit DJGPP v2
+- 32-bit DJGPP v1.12m4
+
+The EMX and DJ1 (GO32) executables were bound with the full extender, to
+create standalone executables.
+
+A) Tests of Zip
+ Command : "<system>\zip.exe -q<#> tes.zip unz/*" (unz/*.* for Watcom!!)
+ where <#> was: 0, 1, 6, 9.
+ The test archive "tes.zip" was never deleted, so this test
+ measured "time to update archive".
+
+ The following table contains average execution seconds (averaged over
+ at least 3 runs, with the first run discarded to fill disk cache);
+ numbers in parentheses specify the standard deviation of the last
+ digits.
+
+ cmpr level| 0 | 1 | 6 | 9
+ ===============================================================
+ EMX win95 | 7.77 | 7.97 | 12.82 | 22.31
+ ---------------------------------------------------------------
+ EMX | 7.15(40) | 8.00(6) | 12.52(25) | 20.93
+ DJ2 | 13.50(32) | 14.20(7) | 19.05 | 28.48(9)
+ DJ1 | 13.56(30) | 14.48(3) | 18.70 | 27.43(13)
+ WAT | 6.94(22) | 8.93 | 15.73(34) | 30.25(6)
+ MSC | 5.99(82) | 9.40(4) | 13.59(9) | 20.77(4)
+ ===============================================================
+
+ The "EMX win95" line was created for comparison, to check the performance
+ of emx 0.9 with the RSX extender in a DPMI environment. (This line was
+ produced by applying the "stubbed" EMX executable in a full screen DOS box.)
+
+
+B) Tests of UnZip
+ Commands : <system>\unzip.exe -qt tes.zip (testing performance)
+ <system>\unzip.exe -qo tes.zip -dtm (extracting performance)
+
+ The tes.zip archive created by maximum compression with the Zip test
+ was used as example archive. Contents (archive size was 347783 bytes):
+ 1028492 bytes uncompressed, 337235 bytes compressed, 67%, 85 files
+
+ The extraction directory tm was not deleted between the individual runs,
+ thus this measurement checks the "overwrite all" time.
+
+ | testing | extracting
+ ===================================================================
+ EMX | 1.98 | 6.43(8)
+ DJ2 | 2.09 | 11.85(39)
+ DJ1 | 2.09 | 7.46(9)
+ WAT | 2.42 | 7.10(27)
+ MSC | 4.94 | 9.57(31)
+
+Remarks:
+
+The executables compiled by me were generated with all "performance"
+options enabled (ASM_CRC, and ASMV for Zip), and with full crypt support.
+For DJ1 and DJ2, the GCC options were "-O2 -m486", for EMX "-O -m486".
+
+The Watcom UnZip was compiled with ASM_CRC code enabled as well,
+but the Watcom Zip example was made without any optional assembler code!
+
+
+
+Discussion of the results:
+
+In overall performance, the EMX executables clearly win.
+For UnZip, emx is by far the fastest program, and the Zip performance is
+comparable to the 16-bit "reference".
+
+Whenever "real" work including I/O is requested, the DJGPP versions
+lose badly because of poor I/O performance; this is especially the case
+for the "newer" DJGPP v2 !!!
+(I tried to tweak with the transfer buffer size, but without any success.)
+An interesting result is that DJ v1 UnZip works remarkably better than
+DJ v2 (in contrast to Zip, where both executables' performance is
+approximately equal).
+
+The Watcom C programs show a clear performance deficit in the "computational
+part" (Watcom C compiler produces code that is far from optimal), but
+the extender (which is mostly responsible for the I/O throughput) seems
+to be quite fast.
+
+The "natural" performance deficit of the 16-bit MSC code, which can be
+clearly seen in the "testing task" comparison for UnZip, is (mostly,
+for Zip more than) compensated by the better I/O throughput (due to the
+"direct interface" between "C RTL" and "DOS services", without any mode
+switching).
+
+But performance is only one aspect when choosing which compiler should
+be used for official distribution:
+
+Sizes of the executables:
+ | Zip || UnZip
+ | standalone stub || standalone | stub
+======================================================================
+EMX | 143,364 (1) | 94,212 || 159,748 (1) | 110,596
+DJ2 | 118,272 (2) | -- || 124,928 (2) | --
+DJ1 | 159,744 | 88,064 || 177,152 | 105,472
+WAT | 140,073 | -- || 116,231 | --
+MSC | 49,212 (3) | -- || 45,510 (3) | --
+
+(1) does not run in "DPMI only" environment (Windows DOS box)
+(2) requires externally supplied DPMI server
+(3) compressed with LZexe 0.91
+
+Caveats/Bugs/Problems of the different extenders:
+
+EMX:
+- requires two different extenders to run in all DOS-compatible environments,
+ EMX for "raw/himem/vcpi" and RSX for "dpmi" (Windows).
+- does not properly support time zones (no daylight savings time)
+
+DJv2:
+- requires an external (freely available) DPMI extender when run on plain
+ DOS; this extender cannot (currently ??) be bound into the executable.
+
+DJv1:
+- uses up a large amount of "low" dos memory (below 1M) when spawning
+ another program; each instance of a DJv1 program requires its private
+ GO32 extender copy in low dos memory (may be a problem for the zip
+ "-T" feature)
+
+Watcom/PMODE:
+- extended memory is allocated statically (default: ALL available memory)
+ This means that a spawned program does not get any extended memory.
+ You can work around this problem by setting a hard limit on the amount
+ of extended memory available to the PMODE program, but this limit is
+ "hard" and restricts the allocatable memory for the program itself.
+ In detail:
+ The Watcom zip.exe as distributed did not allow the "zip -T" feature;
+ there was no extended memory left to spawn unzip.
+ I could work around this problem by applying PMSETUP to change the
+ amount of allocated extended memory to 2.0 MByte (I had 4MB free extended
+ memory on my test system). But, this limit cannot be enlarged at
+ runtime, when zip needs more memory to store "header info" while
+ zipping up a huge drive, and on a system with less free memory, this
+ method is not applicable, either.
+
+Summary:
+
+For Zip:
+Use the 16-bit executable whenever possible (unless you need the
+larger memory capabilities when zipping up a huge amount of files)
+
+As 32-bit executable, we may distribute Watcom C (after we have confirmed
+that enabling ASMV and ASM_CRC gives us better computational
+performance.)
+The alternative for 32-bit remains DJGPP v1, which shows the least problems
+(to my knowledge); v2 and EMX cannot be used because of their lack of
+"universality".
+
+For UnZip:
+Here, the Watcom C 32-bit executable is probably the best compromise,
+but DJ v1 could be used as well.
+And, after all, the 16-bit version does not lose badly when doing
+"real" extraction! For the SFX stub, the 16-bit version remains first
+choice because of its much smaller size!
+
+Best regards
+
+Christian Spieler
diff --git a/proginfo/timezone.txt b/proginfo/timezone.txt
new file mode 100644
index 0000000..7868093
--- /dev/null
+++ b/proginfo/timezone.txt
@@ -0,0 +1,85 @@
+Timezone strings:
+-----------------
+This is a description of valid timezone strings for ENV[ARC]:TZ:
+"XPG3TZ - time zone information"
+The form of the time zone information is based on the XPG3 specification of
+the TZ environment variable. Spaces are allowed only in timezone
+designations, where they are significant. The following description
+closely follows the XPG3 specification, except for the paragraphs starting
+**CLARIFICATION**.
+
+<std><offset>[<dst>[<offset>],<start>[/<time>],<end>[/<time>]]
+
+Where:
+<std> and <dst>
+ Are each three or more bytes that are the designation for the
+ standard (<std>) and daylight savings time (<dst>) timezones.
+ Only <std> is required - if <dst> is missing, then daylight
+ savings time does not apply in this locale. Upper- and
+ lower-case letters are allowed. Any characters except a
+ leading colon (:), digits, a comma (,), a minus (-) or a plus
+ (+) are allowed.
+ **CLARIFICATION** The two-byte designation `UT' is permitted.
+<offset>
+ Indicates the value one must add to the local time to arrive
+ at Coordinated Universal Time. The offset has the form:
+ <hh>[:<mm>[:<ss>]]
+ The minutes <mm> and seconds <ss> are optional. The hour <hh>
+ is required and may be a single digit. The offset following
+ <std> is required. If no offset follows <dst>, daylight savings
+ time is assumed to be one hour ahead of standard time. One or
+ more digits may be used; the value is always interpreted as a
+ decimal number. The hour must be between 0 and 24, and the
+ minutes (and seconds) if present between 0 and 59. Out of
+ range values may cause unpredictable behavior. If preceded by
+ a `-', the timezone is east of the Prime Meridian; otherwise
+ it is west (which may be indicated by an optional preceding
+ `+' sign).
+ **CLARIFICATION** No more than two digits are allowed in any
+ of <hh>, <mm> or <ss>. Leading zeros are permitted.
+<start>/<time> and <end>/<time>
+ Indicates when to change to and back from daylight savings
+ time, where <start>/<time> describes when the change from
+ standard time to daylight savings time occurs, and
+ <end>/<time> describes when the change back happens. Each
+ <time> field describes when, in current local time, the change
+ is made.
+ **CLARIFICATION** It is recognized that in the Southern
+ hemisphere <start> will specify a date later than <end>.
+ The formats of <start> and <end> are one of the following:
+ J<n> The Julian day <n> (1 <= <n> <= 365). Leap days are not
+ counted. That is, in all years, February 28 is day 59
+ and March 1 is day 60. It is impossible to refer to
+ the occasional February 29.
+ <n> The zero-based Julian day (0 <= <n> <= 365). Leap days
+ are counted, and it is possible to refer to February
+ 29.
+ M<m>.<n>.<d>
+ The <d>th day, (0 <= <d> <= 6) of week <n> of month <m>
+ of the year (1 <= <n> <= 5, 1 <= <m> <= 12), where week
+ 5 means `the last <d>-day in month <m>' (which may
+ occur in either the fourth or the fifth week). Week 1
+ is the first week in which the <d>th day occurs. Day
+ zero is Sunday.
+ **CLARIFICATION** Neither <n> nor <m> may have a
+ leading zero. <d> must be a single digit.
+ **CLARIFICATION** The default <start> and <end> values
+ are from the first Sunday in April until the last Sunday
+ in October. This allows United States users to leave out
+ the <start> and <end> parts, as most are accustomed to
+ doing.
+ <time> has the same format as <offset> except that no leading
+ sign (`-' or `+') is allowed. The default, if <time> is not
+ given, is 02:00:00.
+ **CLARIFICATION** The number of hours in <time> may be up
+ to 167, to allow encoding of rules such as `00:00hrs on the
+ Sunday after the second Friday in September'
+
+Example (for Central Europe):
+-----------------------------
+MET-1MEST,M3.5.0,M10.5.0/03
+
+Another example, for the US East Coast:
+---------------------------------------
+EST5EDT4,M4.1.0/02,M10.5.0/02
+This string describes the default values when no time zone is set.
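
The TZ grammar described in timezone.txt can be sketched as a small parser.
The Python below is a minimal illustration only, not the code Info-ZIP uses;
the name parse_tz and the dictionary keys are hypothetical, and the range
checks (hours 0 to 24, minutes 0 to 59) are left out. It applies the defaults
stated in the text: DST one hour ahead of standard time, rules M4.1.0 and
M10.5.0, and a switch time of 02:00:00.

```python
import re

# Designation: any characters except a leading colon, digits, ',', '-', '+'.
_NAME = r"[^:\d,+\-][^\d,+\-]*"
_OFFSET = r"[+\-]?\d{1,2}(?::\d{1,2}){0,2}"   # [+-]hh[:mm[:ss]]
_TIME = r"\d{1,3}(?::\d{1,2}){0,2}"           # unsigned; hours may reach 167
_RULE = r"J\d{1,3}|\d{1,3}|M\d{1,2}\.\d\.\d"  # J<n>, <n> or M<m>.<n>.<d>

# <std><offset>[<dst>[<offset>],<start>[/<time>],<end>[/<time>]]
_TZ_RE = re.compile("".join([
    "(?P<std>", _NAME, ")(?P<std_off>", _OFFSET, ")",
    "(?:(?P<dst>", _NAME, ")(?P<dst_off>", _OFFSET, ")?",
    "(?:,(?P<start>", _RULE, ")(?:/(?P<start_time>", _TIME, "))?",
    ",(?P<end>", _RULE, ")(?:/(?P<end_time>", _TIME, "))?)?)?",
]))


def _seconds(text):
    """Convert [+-]hh[:mm[:ss]] to a signed number of seconds."""
    sign = -1 if text.startswith("-") else 1
    fields = [int(f) for f in text.lstrip("+-").split(":")]
    hours, minutes, secs = fields + [0] * (3 - len(fields))
    return sign * (hours * 3600 + minutes * 60 + secs)


def parse_tz(tz):
    """Split a TZ string into its parts (hypothetical helper)."""
    m = _TZ_RE.fullmatch(tz)
    if m is None:
        raise ValueError("malformed TZ string: %r" % tz)
    for name in (m["std"], m["dst"]):
        # Three or more characters, except for the two-byte `UT'.
        if name is not None and len(name) < 3 and name != "UT":
            raise ValueError("designation too short: %r" % name)
    std_offset = _seconds(m["std_off"])
    info = {"std": m["std"], "std_offset": std_offset, "dst": m["dst"]}
    if m["dst"] is not None:
        # The offset is added to local time to reach UTC, so a clock that
        # runs one hour ahead has an offset that is one hour smaller.
        info["dst_offset"] = (_seconds(m["dst_off"]) if m["dst_off"]
                              else std_offset - 3600)
        info["start"] = m["start"] or "M4.1.0"
        info["end"] = m["end"] or "M10.5.0"
        info["start_time"] = _seconds(m["start_time"] or "02:00:00")
        info["end_time"] = _seconds(m["end_time"] or "02:00:00")
    return info
```

With this sketch, parse_tz("EST5EDT") and the fully spelled-out US East Coast
string yield the same dictionary, and negative offsets (as in MET-1) denote
zones east of the Prime Meridian.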
diff --git a/proginfo/txtvsbin.txt b/proginfo/txtvsbin.txt
new file mode 100644
index 0000000..6ba2805
--- /dev/null
+++ b/proginfo/txtvsbin.txt
@@ -0,0 +1,112 @@
+A Fast Method of Identifying Plain Text Files
+=============================================
+
+
+Introduction
+------------
+
+Given a file coming from an unknown source, it is generally impossible
+to conclude automatically, and with 100% accuracy, whether that file is
+a plain text file, without performing a heavy-duty semantic analysis on
+the file contents. It is, however, possible to obtain a fairly high
+degree of accuracy, by employing various simple heuristics.
+
+Previous versions of the zip tools were using a crude detection scheme,
+originally used by PKWare in its PKZip programs: if more than 80% (4/5)
+of the bytes are within the range [7..127], the file is labeled as plain
+text, otherwise it is labeled as binary. A prominent limitation of this
+scheme is the restriction to Latin-based alphabets. Other alphabets,
+like Greek, Cyrillic or Asian, make extensive use of the bytes within
+the range [128..255], and texts using these alphabets are most often
+mis-identified by this scheme; in other words, the rate of false
+negatives is sometimes too high, which means that the recall is low.
+Another weakness of this scheme is a reduced precision, due to the false
+positives that may occur when binary files containing a large amount of
+textual characters are mis-identified as plain text.
+
+In this article we propose a new detection scheme, with a much increased
+accuracy and precision, and a near-100% recall. This scheme is designed
+to work on ASCII and ASCII-derived alphabets, and it handles single-byte
+alphabets (ISO-8859, OEM, KOI-8, etc.), and variable-sized alphabets
+(DBCS, UTF-8, etc.). However, it cannot handle fixed-sized, multi-byte
+alphabets (UCS-2, UCS-4), nor UTF-16. The principle used by this scheme
+can easily be adapted to non-ASCII alphabets like EBCDIC.
+
+
+The Algorithm
+-------------
+
+The algorithm works by dividing the set of bytes [0..255] into three
+categories:
+- The white list of textual bytecodes:
+ 9 (TAB), 10 (LF), 13 (CR), 32 (SPACE) to 255
+- The gray list of tolerated bytecodes:
+ 7 (BEL), 8 (BS), 11 (VT), 12 (FF), 26 (SUB), 27 (ESC)
+- The black list of undesired, non-textual bytecodes:
+ 0 (NUL) to 6, 14 to 31 (except the gray-listed 26 and 27).
+
+If a file contains at least one byte that belongs to the white list, and
+no byte that belongs to the black list, then the file is categorized as
+plain text. Otherwise, it is categorized as binary.
+
+
+Rationale
+---------
+
+The idea behind this algorithm relies on two observations.
+
+The first observation is that, although the full range of 7-bit codes
+(0..127) is properly specified by the ASCII standard, most control
+characters in the range 0..31 are not used in practice. The only
+widely-used, almost universally-portable control codes are 9 (TAB),
+10 (LF), and 13 (CR). There are a few more control codes that are
+recognized on a reduced range of platforms and text viewers/editors:
+7 (BEL), 8 (BS), 11 (VT), 12 (FF), 26 (SUB), and 27 (ESC); but these
+codes are rarely (if ever) used alone, without being accompanied by
+some printable text. Even the newer, portable text formats, such as
+XML, avoid using control characters outside the list mentioned here.
+
+The second observation is that most of the binary files tend to contain
+control characters, especially 0 (NUL); even though the older text
+detection schemes observe the presence of non-ASCII codes from the range
+[128..255], the precision rarely has to suffer if this upper range is
+labeled as textual, because the files that are genuinely binary tend to
+contain both control characters, and codes from the upper range. On the
+other hand, the upper range needs to be labeled as textual, because it
+is being used by virtually all ASCII extensions. In particular, this
+range is being heavily used to encode non-Latin scripts.
+
+Given the two observations, the plain text detection algorithm becomes
+straightforward. There must be at least some printable material, or
+some portable whitespace such as TAB, CR or LF, otherwise the file is
+not labeled as plain text. (The boundary case, when the file is empty,
+automatically falls into this category.) However, there must be no
+non-portable control characters, otherwise it's very likely that the
+intended reader of that file is a machine, rather than a human.
+
+Since there is no counting involved, other than simply observing the
+presence or the absence of some byte values, the algorithm produces
+uniform results on any particular text file, no matter what alphabet
+encoding is being used for that text. (In contrast, if counting were
+involved, it could be possible to obtain different results on a text
+encoded, say, using ISO-8859-2 versus UTF-8.) There is the category
+of plain text files that are "polluted" with one or a few black-listed
+codes, either by mistake, or by peculiar design considerations. In such
+cases, a scheme that tolerates a small percentage of black-listed codes
+would provide an increased recall (i.e. more true positives). This,
+however, incurs a reduced precision, since false positives are also more
+likely to appear in binary files that contain large chunks of textual
+data. "Polluted" plain text may, in fact, be regarded as binary, on
+which text conversions should not be performed. Under this premise, it
+is safe to say that the detection method provides a near-100% recall.
+
+Experiments have been run on a large set of files of various categories,
+including plain old texts, system logs, source code, formatted office
+documents, compiled object code, etcetera. The results confirm the
+optimistic assumptions about the high accuracy, precision and recall
+offered by this algorithm.
+
+
+--
+Cosmin Truta
+Last updated: 2005-Feb-27
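
The three-list rule from txtvsbin.txt can be sketched in a few lines. This
Python version is an illustrative assumption, not the code shipped in Zip;
is_plain_text is a hypothetical name. It takes SPACE as code 32 and treats
the gray-listed codes 26 (SUB) and 27 (ESC) as tolerated, even though they
fall inside the 14..31 span quoted for the black list.

```python
# Gray list: tolerated control codes (BEL, BS, VT, FF, SUB, ESC).
GRAY_LIST = frozenset({7, 8, 11, 12, 26, 27})

# White list: TAB, LF, CR and everything from SPACE (32) up to 255.
WHITE_LIST = frozenset({9, 10, 13}) | frozenset(range(32, 256))

# Black list: the remaining control codes (0..6, 14..25, 28..31).
BLACK_LIST = frozenset(range(32)) - WHITE_LIST - GRAY_LIST


def is_plain_text(data: bytes) -> bool:
    """Return True if data has a white-listed byte and no black-listed byte."""
    seen = set(data)          # only presence matters, nothing is counted
    if seen & BLACK_LIST:
        return False
    return bool(seen & WHITE_LIST)
```

Because only the presence of byte values is observed, the same text gives the
same answer in KOI-8 and in UTF-8. An empty input, or one made up solely of
gray-listed bytes, is reported as binary.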
diff --git a/proginfo/ziplimit.txt b/proginfo/ziplimit.txt
new file mode 100644
index 0000000..feacabb
--- /dev/null
+++ b/proginfo/ziplimit.txt
@@ -0,0 +1,243 @@
+ziplimit.txt
+
+Zip 3 and UnZip 6 now support many of the extended limits of Zip64.
+
+A) Hard limits of the Zip archive format:
+
+ Number of entries in Zip archive: 64 k (2^16 - 1 entries)
+ Compressed size of archive entry: 4 GByte (2^32 - 1 Bytes)
+ Uncompressed size of entry: 4 GByte (2^32 - 1 Bytes)
+ Size of single-volume Zip archive: 4 GByte (2^32 - 1 Bytes)
+ Per-volume size of multi-volume archives: 4 GByte (2^32 - 1 Bytes)
+ Number of parts for multi-volume archives: 64 k (2^16 - 1 parts)
+ Total size of multi-volume archive: 256 TByte (4G * 64k)
+
+ The number of archive entries and of multivolume parts are limited by
+ the structure of the "end-of-central-directory" record, where these
+ numbers are stored in 2-Byte fields.
+ Some Zip and/or UnZip implementations (for example Info-ZIP's) allow
+ handling of archives with more than 64k entries. (The information
+ from the "number of entries" field in the "end-of-central-directory" record
+ is not really necessary to retrieve the contents of a Zip archive;
+ it should rather be used for consistency checks.)
+
+ Length of an archive entry name: 64 kByte (2^16 - 1)
+ Length of archive member comment: 64 kByte (2^16 - 1)
+ Total length of "extra field": 64 kByte (2^16 - 1)
+ Length of a single e.f. block: 64 kByte (2^16 - 1)
+ Length of archive comment: 64 kByte (2^16 - 1)
+
+ Additional limitation claimed by PKWARE:
+ Size of local-header structure (fixed fields of 30 Bytes + filename +
+ local extra field): < 64 kByte
+ Size of central-directory structure (46 Bytes + filename +
+ central extra field + member comment): < 64 kByte
+
+ Note:
+ In 2001, PKWARE published version 4.5 of the Zip format specification
+ (together with the release of PKZIP for Windows 4.5). This specification
+ defines new extra field blocks that allow exceeding the size limits of the
+ standard zipfile structures. In this extended Zip format, the size limits
+ of zip entries (and the complete zip archive) have been extended to
+ (2^64 - 1) Bytes and the maximum number of archive entries to (2^32-1).
+ Zip 3.0 supports these Zip64 extensions and should be released shortly.
+ UnZip 6.0 should support these standards.
+
+B) Implementation limits of UnZip:
+
+ Note:
+ This section should be updated when UnZip 6.0 is near release.
+
+ 1. Size limits caused by file I/O and decompression handling:
+ Size of Zip archive: 2 GByte (2^31 - 1 Bytes)
+ Compressed size of archive entry: 2 GByte (2^31 - 1 Bytes)
+
+ Note: On some systems, UnZip may support archive sizes up to 4 GByte.
+ To get this support, the target environment has to meet the following
+ requirements:
+ a) The compiler's intrinsic "long" data types must be able to hold
+ integer numbers of 2^32. In other words - the standard intrinsic
+ integer types "long" and "unsigned long" have to be wider than
+ 32 bit.
+ b) The system has to supply a C runtime library that is compatible
+ with the more-than-32-bit-wide "long int" type of condition a)
+ c) The standard file positioning functions fseek(), ftell() (and/or
+ the Unix style lseek() and tell() functions) have to be capable
+ of moving to absolute file offsets of up to 4 GByte from the file
+ start.
+ On 32-bit CPU hardware, you generally cannot expect that a C compiler
+ provides a "long int" type that is wider than 32-bit. So, many of the
+ most popular systems (i386, PowerPC, 680x0, et al.) are out of luck.
+ You may find environments that meet all requirements on systems
+ with 64-bit CPU hardware. Examples might be Cray number crunchers
+ or Compaq (former DEC) Alpha AXP machines.
+
+ The number of Zip archive entries is unlimited. The "number-of-entries"
+ field of the "end-of-central-dir" record is checked against the "number
+ of entries found in the central directory" modulo 64k (2^16).
+
+ Multi-volume archive extraction is not supported.
+
+ Memory requirements are mostly independent of the archive size
+ and archive contents.
+ In general, UnZip needs a fixed amount of internal buffer space
+ plus the size to hold the complete information of the currently
+ processed entry's local header. Here, a large extra field
+ (could be up to 64 kByte) may exceed the available memory
+ for MSDOS 16-bit executables (when they were compiled in small
+ or medium memory model, with a fixed 64kByte limit on data space).
+
+ The other exception where memory requirements scale with "larger"
+ archives is the "restore directory attributes" feature. Here, the
+ directory attributes info for each restored directory has to be held
+ in memory until the whole archive has been processed. So, the amount
+ of memory needed to keep this info scales with the number of restored
+ directories and may cause memory problems when a lot of directories
+ are restored in a single run.
+
+C) Implementation limits of the Zip executables:
+
+ Note:
+ This section has been updated to reflect Zip 3.0.
+
+ 1. Size limits caused by file I/O and compression handling:
+ Without Zip64 extensions:
+ Size of Zip archive: 2 GByte (2^31 - 1 Bytes)
+ Compressed size of archive entry: 2 GByte (2^31 - 1 Bytes)
+ Uncompressed size of entry: 2 GByte (2^31 - 1 Bytes),
+ (could/should be 4 GBytes...)
+ Using Zip64 extensions:
+ Size of Zip archive: 2^63 - 1 Bytes
+ Compressed size of archive entry: 2^63 - 1 Bytes
+ Uncompressed size of entry: 2^63 - 1 Bytes
+
+ Multi-volume archive creation is now supported in the form of split
+ archives. Currently up to 99,999 splits are supported.
+
+ 2. Limits caused by handling of archive contents lists
+
+ 2.1. Number of archive entries (freshen, update, delete)
+ a) 16-bit executable: 64k (2^16 -1) or 32k (2^15 - 1),
+ (unsigned vs. signed type of size_t)
+ a1) 16-bit executable: <16k ((2^16)/4)
+ (The smaller limit a1) results from the array size limit of
+ the "qsort()" function.)
+
+ 32-bit executables: <1G ((2^32)/4)
+ (usual system limit of the "qsort()" function on 32-bit systems)
+
+ b) stack space needed by qsort to sort list of archive entries
+
+ NOTE: In the current executables, overflows of limits a) and b) are NOT
+ checked!
+
+ c) amount of free memory to hold "central directory information" of
+ all archive entries; one entry needs:
+ 96 bytes (32-bit) resp. 80 bytes (16-bit)
+ + 3 * length of entry name
+ + length of zip entry comment (when present)
+ + length of extra field(s) (when present, e.g.: UT needs 9 bytes)
+ + some bytes for book-keeping of memory allocation
+
+ Conclusion:
+ For systems with limited memory space (MSDOS, small AMIGAs, other
+ environments without virtual memory), the number of archive entries
+ is most often limited by condition c).
+ For example, with approx. 100 kBytes of free memory after loading and
+ initializing the program, a 16-bit DOS Zip cannot process more than 600
+ to 1000 (+) archive entries. (For the 16-bit Windows DLL or the 16-bit
+ OS/2 port, limit c) is less important because Windows or OS/2 executables
+ are not restricted to the 1024k area of real mode memory. These 16-bit
+ ports are limited by conditions a1) and b), say: at maximum approx.
+ 16000 entries!)
+
+
+ 2.2. Number of "new" entries (add operation)
+ In addition to the restrictions above (2.1.), the following limits
+ caused by the handling of the "new files" list apply:
+
+ a) 16-bit executable: <16k ((2^16)/4)
+
+ b) stack size required for "qsort" operation on "new entries" list.
+
+ NOTE: In the current executables, the overflow checks for these limits
+ are missing!
+
+ c) amount of free memory to hold the directory info list for new entries;
+ one entry needs:
+ 24 bytes (32-bit) resp. 22 bytes (16-bit)
+ + 3 * length of filename
+
+ NOTE: On larger systems, the practical limit is more likely to be
+ performance (how long you are willing to wait) than available memory
+ or other resources.
+
+D) Some technical remarks:
+
+ 1. For executables compiled without LARGE_FILE_SUPPORT and ZIP64_SUPPORT
+ enabled, the 2GByte size limit on archive files is a consequence of
+ the portable C implementation of the Info-ZIP programs. Zip archive
+ processing requires random access to the archive file for jumping
+ between different parts of the archive's structure. In standard C,
+ this is done via stdio functions fseek()/ftell() resp. unix-io functions
+ lseek()/tell(). In many (most?) C implementations, these functions use
+ "signed long" variables to hold offset pointers into sequential files.
+ In most cases, this is a signed 32-bit number, which is limited to
+ ca. 2E+09. There may be specific C runtime library implementations
+ that interpret the offset numbers as unsigned, but for us, this is not
+ reliable in the context of portable programming.
+
+ If LARGE_FILE_SUPPORT and ZIP64_SUPPORT are defined and supported by
+ the system, file offsets are held in a 64-bit off_t and the larger
+ limits listed above apply. As off_t is signed, the maximum offset
+ is usually limited to 2^63 - 1.
+
+ 2. The 2GByte limit on the size of a single compressed archive member
+ is again a consequence of the implementation in C.
+ The variables used internally to count the size of the compressed
+ data stream are of type "long", which is guaranteed to be at least
+ 32 bits wide on all supported environments.
+
+ But, why do we use "signed" long and not "unsigned long"?
+
+ Throughout the I/O handling of the compressed data stream, the
+ sign bit of the "long" numbers is (mis-)used as a kind of overflow
+ detection. In the end, this is caused by the fact that standard C
+ lacks any overflow checking on integer arithmetic and does not
+ support access to the underlying hardware's overflow detection
+ (the status bits, especially "carry" and "overflow" of the CPU's
+ flags-register) in a system-independent manner.
+
+ So, we "misuse" the most-significant bit of the compressed data
+ size counters as carry bit for efficient overflow/underflow detection.
+ We could change the code to a different method of overflow detection,
+ by using a bunch of "sanity" comparisons (kind of "is the calculated
+ result plausible when compared with the operands"). But, this would
+ "blow up" the code of the "inner loop", with a noticeable loss of
+ processing speed. Or, we could reduce the number of consistency checks
+ of the compressed data (e.g. detection of premature end of stream) to
+ an absolute minimum, at the cost of the programs' stability when
+ processing corrupted data.
+
+ Summary: Changing the compression/decompression core routines to
+ be "unsigned safe" would require excessive recoding, with little
+ gain on maximum processable uncompressed size (a gain can only be
+ expected for hardly compressible data), but at severe costs on
+ performance, stability and maintainability. Therefore, it is
+ quite unlikely that this will ever happen for Zip/UnZip.
+
+ With LARGE_FILE_SUPPORT and ZIP64_SUPPORT enabled and supported,
+ the above arguments still apply, but the limits are based on 64 bits
+ instead of 32 and should allow most large files and archives to be
+ processed.
+
+ Anyway, the Zip archive format is more and more showing its age...
+ The effort to lift the 2GByte limits would be better invested in
+ creating a successor for the Zip archive format and tools. But given
+ the latest improvements to the format and the wide acceptance of zip
+ files, the format will probably be around for a while longer.
+
+Please report any problems using the web contact form at: www.Info-ZIP.org
+
+Last updated: 26 January 2002, Christian Spieler
+ 25 May 2008, Ed Gordon