Modifying The TIFF Library
==========================

.. image:: images/dave.gif
    :width: 107
    :alt: dave

This chapter provides information about the internal structure of
the library, how to control the configuration when building it, and
how to add new support to the library.

Library Configuration
---------------------

Information on compiling the library is given in :doc:`build`,
elsewhere in this documentation.
This section describes the low-level mechanisms used to control
the optional parts of the library that are configured at build
time.  Control is based on
a collection of C defines that are specified either on the compiler
command line or in a configuration file such as :file:`port.h`
(as generated by the :program:`configure` script for UNIX systems)
or :file:`tiffconf.h`.

Configuration defines are split into three areas:

* those that control which compression schemes are
  configured as part of the builtin codecs,
* those that control support for groups of tags that
  are considered optional, and
* those that control operating system or machine-specific support.

The following built-in compression algorithms are enabled by default:

* CCITT Group 3 and 4 algorithms (compression codes 2, 3, 4, and 32771),
* the Macintosh PackBits algorithm (compression 32773),
* a 4-bit run-length encoding scheme from ThunderScan (compression 32809),
* a 2-bit encoding scheme used by NeXT (compression 32766), and
* two experimental schemes intended for images with high dynamic range
  (compression 34676 and 34677).

To override the default compression behaviour, set the defines that
enable configuration of the appropriate codecs (see the list
below); e.g. :c:macro:`PACKBITS_SUPPORT` and :c:macro:`CCITT_SUPPORT`.
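
As a rough illustration of how such defines are typically consumed, the
sketch below uses three of the macros named in this section to select code at
compile time.  The ``enabled_codec_count`` helper is hypothetical and is not
part of LibTIFF; in a real build the defines would come from the compiler
command line or from :file:`tiffconf.h`, not from the source file itself.

.. code-block:: c

    /* Normally supplied via -DPACKBITS_SUPPORT on the command line or by
     * tiffconf.h; defined here only to keep the sketch self-contained. */
    #define PACKBITS_SUPPORT 1
    #define CCITT_SUPPORT 1
    /* THUNDER_SUPPORT is deliberately left undefined. */

    /* Hypothetical helper: counts how many of three codecs were compiled in. */
    static int enabled_codec_count(void)
    {
        int n = 0;
    #ifdef PACKBITS_SUPPORT
        n++;
    #endif
    #ifdef CCITT_SUPPORT
        n++;
    #endif
    #ifdef THUNDER_SUPPORT
        n++;
    #endif
        return n;
    }

Code guarded this way is absent from the build when its define is missing,
rather than being skipped at run time.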

Several other compression schemes are configured separately from
the default set because they depend on ancillary software
packages that are not distributed with LibTIFF.  They will be
enabled automatically if the CMake or Autotools build configuration
detects them, or they may be explicitly enabled or disabled.

Support for JPEG compression is controlled by :c:macro:`JPEG_SUPPORT`.
The JPEG codec that comes with LibTIFF is designed for
use with release 5 or later of the Independent JPEG Group's freely
available software distribution.
This software can be retrieved from the directory
`<ftp://ftp.uu.net/graphics/jpeg>`_.

.. note::

    Enabling JPEG support automatically enables support for
    the TIFF 6.0 colorimetry and YCbCr-related tags.

.. c:macro:: DEFLATE_SUPPORT

    Enable Deflate support

Experimental support for the deflate algorithm is controlled by
:c:macro:`DEFLATE_SUPPORT`.
The deflate codec that comes with LibTIFF is designed
for use with version 0.99 or later of the freely available
``libz`` library written by Jean-loup Gailly and Mark Adler.
The data format used by this library is described
in the files
`<ftp://ftp.uu.net/pub/archiving/zip/doc/zlib-3.1.doc>`_,
and
`<ftp://ftp.uu.net/pub/archiving/zip/doc/deflate-1.1.doc>`_,
available in the directory
`<ftp://ftp.uu.net/pub/archiving/zip/doc>`_.
The library can be retrieved from the directory
`<ftp://ftp.uu.net/pub/archiving/zip/zlib>`_
(or try `<ftp://quest.jpl.nasa.gov/beta/zlib/>`_).

.. warning::

    **The deflate algorithm is experimental.  Do not expect
    to exchange files using this compression scheme;
    it is included only because the similar, and more common,
    LZW algorithm is claimed to be governed by licensing restrictions.**

By default :file:`tiffconf.h` defines
:c:macro:`COLORIMETRY_SUPPORT`,
:c:macro:`YCBCR_SUPPORT`,
and 
:c:macro:`CMYK_SUPPORT`.


:file:`tiffconf.h` defines:

.. c:macro:: CCITT_SUPPORT

    CCITT Group 3 and 4 algorithms (compression codes 2, 3, 4, and 32771)

.. c:macro:: PACKBITS_SUPPORT

    Macintosh PackBits algorithm (compression 32773)

.. c:macro:: LZW_SUPPORT

    Lempel-Ziv & Welch (LZW) algorithm (compression 5)

.. c:macro:: THUNDER_SUPPORT

    4-bit run-length encoding scheme from ThunderScan (compression 32809)

.. c:macro:: NEXT_SUPPORT

    2-bit encoding scheme used by NeXT (compression 32766)

.. c:macro:: OJPEG_SUPPORT

    obsolete JPEG scheme defined in the 6.0 spec (compression 6)

.. c:macro:: JPEG_SUPPORT

    current JPEG scheme defined in TTN2 (compression 7)

.. c:macro:: ZIP_SUPPORT

    experimental Deflate scheme (compression 32946)

.. c:macro:: PIXARLOG_SUPPORT

    Pixar's compression scheme for high-resolution color images (compression 32909)

.. c:macro:: SGILOG_SUPPORT

    SGI's compression scheme for high-resolution color images (compression 34676 and 34677)

.. c:macro:: COLORIMETRY_SUPPORT

    support for the TIFF 6.0 colorimetry tags

.. c:macro:: YCBCR_SUPPORT

    support for the TIFF 6.0 YCbCr-related tags

.. c:macro:: CMYK_SUPPORT

    support for the TIFF 6.0 CMYK-related tags

.. c:macro:: ICC_SUPPORT

    support for the ICC Profile tag; see
    *The ICC Profile Format Specification*,
    Annex B.3 "Embedding ICC Profiles in TIFF Files";
    available at `<http://www.color.org/>`_

General Portability Comments
----------------------------

This software is developed on Silicon Graphics UNIX
systems (big-endian, MIPS CPU, 32-bit ints,
IEEE floating point). 
The :program:`configure` shell script generates the appropriate
include files and make files for UNIX systems.
Makefiles exist for non-UNIX platforms that the
code runs on---this work has mostly been done by other people.

In general, the code is guaranteed to work only on SGI machines.
In practice it is highly portable to any 32-bit or 64-bit system and much
work has been done to ensure portability to 16-bit systems.
If you encounter portability problems please return fixes so
that future distributions can be improved.

The software is written to assume an ANSI C compilation environment.
If your compiler does not support ANSI function prototypes, ``const``,
and :file:`<stdarg.h>` then you will have to make modifications to the
software.  In the past I have tried to support compilers without ``const``
and systems without :file:`<stdarg.h>`, but I am
**no longer interested in these
antiquated environments**.  With the general availability of
the freely available GCC compiler, I
see no reason to incorporate modifications to the software for these
purposes.

An effort has been made to isolate as many of the
operating system dependencies
as possible in two files: :file:`tiffcomp.h` and
:file:`libtiff/tif_<os>.c`.  The latter file contains
operating system-specific routines to do I/O and I/O-related operations.
The UNIX (:file:`tif_unix.c`) code has had the most use.

Native CPU byte order is determined on the fly by
the library and does not need to be specified.
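
One common way to determine byte order at run time is to inspect the first
byte of a known multi-byte value.  The sketch below shows that technique
together with a 16-bit byte swap; both helper names are hypothetical and this
is not necessarily how the library itself performs the check.

.. code-block:: c

    #include <stdint.h>
    #include <string.h>

    /* Hypothetical helper, not a LibTIFF function: returns 1 on a
     * big-endian host and 0 on a little-endian host. */
    static int host_is_big_endian(void)
    {
        const uint16_t probe = 0x0102;
        uint8_t first;
        memcpy(&first, &probe, 1); /* inspect the lowest-addressed byte */
        return first == 0x01;
    }

    /* Swap the two bytes of a 16-bit value, as is needed when the byte
     * order of a file differs from that of the host. */
    static uint16_t swap16(uint16_t v)
    {
        return (uint16_t)((v << 8) | (v >> 8));
    }

A reader would apply ``swap16`` only when the byte order recorded in the file
differs from the value returned by ``host_is_big_endian``.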

The following defines control general portability:

.. c:macro:: HAVE_MMAP

    Define this if there is *mmap-style* support for
    mapping files into memory (used only to read data).

.. c:macro:: HOST_FILLORDER

    Define the native CPU bit order: one of :c:macro:`FILLORDER_MSB2LSB`
    or :c:macro:`FILLORDER_LSB2MSB`

.. c:macro:: HOST_BIGENDIAN

    Define the native CPU byte order: 1 if big-endian (Motorola)
    or 0 if little-endian (Intel); this may be used
    in codecs to optimize code

The :c:macro:`HOST_FILLORDER` and :c:macro:`HOST_BIGENDIAN`
definitions are not currently used, but may be employed by
codecs for optimization purposes.
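
The two fill orders differ only in which end of a byte holds the first
pixel, so converting between them amounts to reversing the bits within each
byte.  The following is a minimal sketch of that operation; the helper names
are hypothetical and the library's own bit-reversal routines may be
implemented differently (for example, with a lookup table).

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>

    /* Reverse the order of the eight bits in one byte by swapping
     * nibbles, then bit pairs, then adjacent bits. */
    static uint8_t reverse_bits8(uint8_t b)
    {
        b = (uint8_t)(((b & 0xF0u) >> 4) | ((b & 0x0Fu) << 4));
        b = (uint8_t)(((b & 0xCCu) >> 2) | ((b & 0x33u) << 2));
        b = (uint8_t)(((b & 0xAAu) >> 1) | ((b & 0x55u) << 1));
        return b;
    }

    /* Convert a buffer in place from one fill order to the other. */
    static void reverse_bits_buffer(uint8_t *buf, size_t n)
    {
        for (size_t i = 0; i < n; i++)
            buf[i] = reverse_bits8(buf[i]);
    }

Applying ``reverse_bits_buffer`` twice restores the original data, so the
same routine serves for both directions of the conversion.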

On UNIX systems :c:macro:`HAVE_MMAP` is defined through the running of
the :program:`configure` script; otherwise support for memory-mapped
files is disabled.

Types and Portability
---------------------

The software makes extensive use of C typedefs to promote portability.
Two sets of typedefs are used, one for communication with clients
of the library and one for internal data structures and parsing of the
TIFF format.  There are interactions between these two to be careful
of, but for the most part you should be able to deal with portability
purely by fiddling with the following machine-dependent typedefs.  Note
that C99 :file:`stdint.h` types are used in most cases.
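
To illustrate why fixed-width types help when parsing the TIFF format, the
sketch below reads the 16-bit version field from the first four bytes of a
file without depending on the host's integer sizes or byte order.  The
function name is hypothetical; the header layout (``II`` or ``MM`` followed
by the value 42 for classic TIFF) is taken from the TIFF 6.0 specification.

.. code-block:: c

    #include <stdint.h>

    /* Hypothetical helper: returns the version field of a TIFF header
     * (42 for classic TIFF), or 0 if the byte-order mark is unknown. */
    static uint16_t tiff_header_version(const uint8_t hdr[4])
    {
        if (hdr[0] == 'I' && hdr[1] == 'I')   /* little-endian file */
            return (uint16_t)(hdr[2] | (hdr[3] << 8));
        if (hdr[0] == 'M' && hdr[1] == 'M')   /* big-endian file */
            return (uint16_t)((hdr[2] << 8) | hdr[3]);
        return 0;
    }

Because the value is assembled byte by byte into a ``uint16_t``, the result
is the same on any host.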

Included through :file:`tiff.h`:

.. c:type:: uint8_t

    8-bit unsigned integer

.. c:type:: int8_t

    8-bit signed integer

.. c:type:: uint16_t

    16-bit unsigned integer

.. c:type:: int16_t

    16-bit signed integer

.. c:type:: uint32_t

    32-bit unsigned integer

.. c:type:: int32_t

    32-bit signed integer

.. c:type:: uint64_t

    64-bit unsigned integer

.. c:type:: int64_t

    64-bit signed integer

.. c:type:: size_t

    C size type

.. c:type:: va_list

    Variable argument list

The public typedefs used throughout the library and in public interfaces are
described in Section :ref:`public-data-types`.

The following typedefs are used throughout the library and interfaces
to refer to certain objects whose size is dependent on the TIFF image
structure:

.. c:type:: unsigned char * tidata_t

    internal image data

The following macros are used from the standard library:

.. c:macro:: NULL

    Null pointer value

The following types have been used in the past and are obsoleted by the use of the
C library integer types, above:

.. c:type:: u_char

    Obsolete type.  Use :c:type:`uint8_t`.

.. c:type:: u_short

    Obsolete type.  Use :c:type:`uint16_t`.

.. c:type:: u_int

    Obsolete type.  Use :c:type:`uint32_t`.

.. c:type:: u_long

    Obsolete type.  Use :c:type:`uint64_t`.

.. c:type:: int8

    Obsolete type.  Use :c:type:`int8_t`.

.. c:type:: uint8

    Obsolete type.  Use :c:type:`uint8_t`.

.. c:type:: int16

    Obsolete type.  Use :c:type:`int16_t`.

.. c:type:: uint16

    Obsolete type.  Use :c:type:`uint16_t`.

.. c:type:: int32

    Obsolete type.  Use :c:type:`int32_t`.

.. c:type:: uint32

    Obsolete type.  Use :c:type:`uint32_t`.

.. c:type:: int64

    Obsolete type.  Use :c:type:`int64_t`.

.. c:type:: uint64

    Obsolete type.  Use :c:type:`uint64_t`.

.. c:type:: dblparam_t

    Obsolete type.  Use :c:expr:`double`.

The following C types and functions are used from the standard library:

.. c:type:: FILE

    File handle

.. c:function:: int memcmp(const void* lhs, const void* rhs, size_t count)

    See `memcmp <https://en.cppreference.com/w/c/string/byte/memcmp>`_

.. c:function:: void* memcpy(void *restrict dest, const void *restrict src, size_t count)

    See `memcpy <https://en.cppreference.com/w/c/string/byte/memcpy>`_

.. c:function:: void* memmove(void* dest, const void* src, size_t count)

    See `memmove <https://en.cppreference.com/w/c/string/byte/memmove>`_

.. c:function:: void *memset(void *dest, int ch, size_t count)

    See `memset <https://en.cppreference.com/w/c/string/byte/memset>`_

.. c:function:: long strtol(const char *restrict str, char **restrict str_end, int base)

    See `strtol <https://en.cppreference.com/w/c/string/byte/strtol>`_

.. c:function:: long long strtoll(const char *restrict str, char **restrict str_end, int base)

    See `strtoll <https://en.cppreference.com/w/c/string/byte/strtol>`_

.. c:function:: unsigned long strtoul(const char *restrict str, char **restrict str_end, int base)

    See `strtoul <https://en.cppreference.com/w/c/string/byte/strtoul>`_

.. c:function:: unsigned long long strtoull(const char *restrict str, char **restrict str_end, int base)

    See `strtoull <https://en.cppreference.com/w/c/string/byte/strtoul>`_

.. c:function:: void* bsearch(const void *key, const void *ptr, size_t count, size_t size, int (*comp)(const void*, const void*))

    See `bsearch <https://en.cppreference.com/w/c/algorithm/bsearch>`_

.. c:function:: void* malloc(size_t size)

    See `malloc <https://en.cppreference.com/w/c/memory/malloc>`_

.. c:function:: void *realloc(void *ptr, size_t new_size)

    See `realloc <https://en.cppreference.com/w/c/memory/realloc>`_

.. c:function:: void free(void* ptr)

    See `free <https://en.cppreference.com/w/c/memory/free>`_

.. c:function:: int printf(const char *restrict format, ...)

    See `printf <https://en.cppreference.com/w/c/io/fprintf>`_

.. c:function:: int snprintf(char *restrict buffer, size_t bufsz, const char *restrict format, ...)

    See `snprintf <https://en.cppreference.com/w/c/io/fprintf>`_

.. c:function:: int fscanf(FILE *stream, const char *format, ...)

    See `fscanf <https://en.cppreference.com/w/c/io/fscanf>`_

.. c:function:: double pow(double base, double exponent)

    See `pow <https://en.cppreference.com/w/c/numeric/math/pow>`_

The following POSIX types and functions are used from the standard library:

.. c:type:: ssize_t

    Signed size type

.. c:type:: off_t

    File offset

.. c:function:: int open(const char *path, int oflag, ...)

    See `open <https://pubs.opengroup.org/onlinepubs/9699919799/functions/open.html>`_

.. c:function:: int close(int fildes)

    See `close <https://pubs.opengroup.org/onlinepubs/9699919799/functions/close.html>`_

.. c:function:: ssize_t read(int fildes, void *buf, size_t nbyte)

    See `read <https://pubs.opengroup.org/onlinepubs/9699919799/functions/read.html>`_

.. c:function:: ssize_t write(int fildes, const void *buf, size_t nbyte)

    See `write <https://pubs.opengroup.org/onlinepubs/9699919799/functions/write.html>`_

.. c:function:: off_t lseek(int fildes, off_t offset, int whence)

    See `lseek <https://pubs.opengroup.org/onlinepubs/9699919799/functions/lseek.html>`_

.. c:function:: int fseeko(FILE *stream, off_t offset, int whence)

    See `fseeko <https://pubs.opengroup.org/onlinepubs/9699919799/functions/fseeko.html>`_

.. c:function:: int mkstemp(char *template)

    See `mkstemp <https://pubs.opengroup.org/onlinepubs/9699919799/functions/mkstemp.html>`_

.. c:function:: void *mmap(void *addr, size_t len, int prot, int flags, int fildes, off_t off)

    See `mmap <https://pubs.opengroup.org/onlinepubs/9699919799/functions/mmap.html>`_

.. c:function:: int munmap(void *addr, size_t len)

    See `munmap <https://pubs.opengroup.org/onlinepubs/9699919799/functions/munmap.html>`_

.. c:function:: void *lfind(const void *key, const void *base, size_t *nelp, size_t width, int (*compar)(const void *, const void *))

    See `lfind <https://pubs.opengroup.org/onlinepubs/9699919799/functions/lfind.html>`_

The following Windows types and functions are used from the C runtime:

.. c:type:: BOOL

    Boolean type

.. c:type:: LONG

    Long integer type

.. c:type:: DWORD

    Double-length word

.. c:type:: HANDLE

    File handle

.. c:type:: LPCSTR

    Long pointer to constant string

.. c:type:: LPCWSTR

    Long pointer to constant wide string

.. c:type:: LPVOID

    Long pointer to void

.. c:type:: LPCVOID

    Long pointer to const void

.. c:type:: LPDWORD

    Long pointer to double-length word

.. c:type:: LPOVERLAPPED

    Long pointer to overlapped structure

.. c:type:: LPSECURITY_ATTRIBUTES

    Long pointer to security attributes

.. c:type:: PLONG

    Pointer to long

.. c:function:: HANDLE CreateFileW(LPCWSTR lpFileName, DWORD dwDesiredAccess, DWORD dwShareMode, LPSECURITY_ATTRIBUTES lpSecurityAttributes, DWORD dwCreationDisposition, DWORD dwFlagsAndAttributes, HANDLE hTemplateFile)

    See `CreateFileW <https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-createfilew>`_

.. c:function:: HANDLE CreateFileA(LPCSTR lpFileName, DWORD dwDesiredAccess, DWORD dwShareMode, LPSECURITY_ATTRIBUTES lpSecurityAttributes, DWORD dwCreationDisposition, DWORD dwFlagsAndAttributes, HANDLE hTemplateFile)

    See `CreateFileA <https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-createfilea>`_

.. c:function:: BOOL CloseHandle(HANDLE hObject)

    See `CloseHandle <https://docs.microsoft.com/en-us/windows/win32/api/handleapi/nf-handleapi-closehandle>`_

.. c:function:: BOOL ReadFile(HANDLE hFile, LPVOID lpBuffer, DWORD nNumberOfBytesToRead, LPDWORD lpNumberOfBytesRead, LPOVERLAPPED lpOverlapped)

    See `ReadFile <https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-readfile>`_

.. c:function:: BOOL WriteFile(HANDLE hFile, LPCVOID lpBuffer, DWORD nNumberOfBytesToWrite, LPDWORD lpNumberOfBytesWritten, LPOVERLAPPED lpOverlapped)

    See `WriteFile <https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-writefile>`_

.. c:function:: DWORD SetFilePointer(HANDLE hFile, LONG lDistanceToMove, PLONG lpDistanceToMoveHigh, DWORD dwMoveMethod)

    See `SetFilePointer <https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-setfilepointer>`_

.. c:function:: HANDLE CreateFileMappingA(HANDLE hFile, LPSECURITY_ATTRIBUTES lpFileMappingAttributes, DWORD flProtect, DWORD dwMaximumSizeHigh, DWORD dwMaximumSizeLow, LPCSTR lpName)

    See `CreateFileMappingA <https://docs.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-createfilemappinga>`_

.. c:function:: BOOL UnmapViewOfFile(LPCVOID lpBaseAddress)

    See `UnmapViewOfFile <https://docs.microsoft.com/en-us/windows/win32/api/memoryapi/nf-memoryapi-unmapviewoffile>`_

General Comments
----------------

The library is designed to hide as many of the details of TIFF from
applications as
possible.  In particular, TIFF directories are read in their entirety
into an internal format.  Only the tags known by the library are
available to a user and certain tag data may be maintained that a user
does not care about (e.g. transfer function tables).

Adding New Builtin Codecs
-------------------------

To add builtin support for a new compression algorithm, you can either
use the "tag-extension" trick to override the handling of the
TIFF Compression tag (see :doc:`addingtags`),
or do the following to add support directly to the core library:

* Define the tag value in :file:`tiff.h`.
* Edit the file :file:`tif_codec.c` to add an entry to the
  :c:var:`_TIFFBuiltinCODECS` array (see how other algorithms are handled).
* Add the appropriate function prototype declaration to
  :file:`tiffiop.h` (close to the bottom).
* Create a file with the compression scheme code; by convention, files
  are named :file:`tif_*.c` (except perhaps on some systems where the
  ``tif_`` prefix pushes some filenames over 14 characters).
* Update the build configuration to include the new source file.
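
The registration step can be pictured as a table that maps a compression
tag value to an initialization function.  The sketch below is a
self-contained approximation: ``fake_tiff``, ``codec_entry``,
``builtin_codecs``, ``find_codec``, ``init_foo``, and the tag value
``COMPRESSION_FOO`` are all hypothetical stand-ins and do not reproduce the
library's actual declarations.

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical stand-ins for the library's internal types. */
    typedef struct fake_tiff { uint16_t scheme; } fake_tiff;
    typedef int (*init_method)(fake_tiff *tif, int scheme);

    typedef struct {
        const char *name;  /* human-readable codec name */
        uint16_t scheme;   /* value of the Compression tag */
        init_method init;  /* installs the codec's entry points */
    } codec_entry;

    #define COMPRESSION_FOO 65000 /* made-up tag value for illustration */

    static int init_foo(fake_tiff *tif, int scheme)
    {
        tif->scheme = (uint16_t)scheme;
        return 1; /* non-zero means success in this sketch */
    }

    /* Table of known codecs, terminated by an entry with a NULL name. */
    static const codec_entry builtin_codecs[] = {
        {"Foo", COMPRESSION_FOO, init_foo},
        {NULL, 0, NULL},
    };

    /* Look up the codec registered for a compression scheme, if any. */
    static const codec_entry *find_codec(uint16_t scheme)
    {
        for (const codec_entry *c = builtin_codecs; c->name != NULL; c++)
            if (c->scheme == scheme)
                return c;
        return NULL;
    }

Keeping the table as the only place that names every codec means that
adding a scheme touches one array entry plus the new source file.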

A codec, say ``foo``, can have many different entry points:

::

    TIFFInitfoo(tif, scheme)    /* initialize scheme and set up entry points in tif */
    fooSetupDecode(tif)         /* called once per IFD after tags have been frozen */
    fooPreDecode(tif, sample)   /* called once per strip/tile, after data is read,
                                   but before the first row is decoded */
    fooDecode*(tif, bp, cc, sample) /* decode cc bytes of data into the buffer */
        fooDecodeRow(...)       /* called to decode a single scanline */
        fooDecodeStrip(...)     /* called to decode an entire strip */
        fooDecodeTile(...)      /* called to decode an entire tile */
    fooSetupEncode(tif)         /* called once per IFD after tags have been frozen */
    fooPreEncode(tif, sample)   /* called once per strip/tile, before the first row in
                                   a strip/tile is encoded */
    fooEncode*(tif, bp, cc, sample) /* encode cc bytes of user data (bp) */
        fooEncodeRow(...)       /* called to encode a single scanline */
        fooEncodeStrip(...)     /* called to encode an entire strip */
        fooEncodeTile(...)      /* called to encode an entire tile */
    fooPostEncode(tif)          /* called once per strip/tile, just before data is written */
    fooSeek(tif, row)           /* seek forwards row scanlines from the beginning
                                   of a strip (row will always be >0 and <rows/strip) */
    fooCleanup(tif)             /* called when compression scheme is replaced by user */

Note that the encoding and decoding variants are only needed when
a compression algorithm is dependent on the structure of the data.
For example, Group 3 2D encoding and decoding maintains a reference
scanline.  The sample parameter identifies which sample is to be
encoded or decoded if the image is organized with ``PlanarConfig=2``
(separate planes).  This is important for algorithms such as JPEG.
If ``PlanarConfig=1`` (interleaved), then sample will always be 0.
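
The way an initialization routine installs entry points can be sketched with
function pointers.  The example below uses a heavily reduced, hypothetical
``mini_tiff`` structure in place of the real :c:struct:`TIFF` structure, and
a trivial codec whose "decoding" is a plain copy; none of these names or
signatures are the library's own.

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    /* Hypothetical, heavily reduced stand-in for the TIFF structure. */
    typedef struct mini_tiff mini_tiff;
    struct mini_tiff {
        const uint8_t *rawcp; /* next unread byte of raw strip data */
        size_t rawcc;         /* raw bytes remaining in the strip */
        int (*predecode)(mini_tiff *tif, uint16_t sample);
        int (*decoderow)(mini_tiff *tif, uint8_t *bp, size_t cc,
                         uint16_t sample);
        void (*cleanup)(mini_tiff *tif);
    };

    static int foo_predecode(mini_tiff *tif, uint16_t sample)
    {
        (void)tif;
        (void)sample; /* a stateful codec would reset per-strip state here */
        return 1;
    }

    /* "Decode" cc bytes into bp by copying them from the raw buffer. */
    static int foo_decode(mini_tiff *tif, uint8_t *bp, size_t cc,
                          uint16_t sample)
    {
        (void)sample; /* always 0 for interleaved data */
        if (tif->rawcc < cc)
            return 0; /* not enough raw data left in the strip */
        memcpy(bp, tif->rawcp, cc);
        tif->rawcp += cc;
        tif->rawcc -= cc;
        return 1;
    }

    static void foo_cleanup(mini_tiff *tif) { (void)tif; }

    /* Counterpart of TIFFInitfoo in this sketch: install entry points. */
    static int init_foo_codec(mini_tiff *tif)
    {
        tif->predecode = foo_predecode;
        tif->decoderow = foo_decode;
        tif->cleanup = foo_cleanup;
        return 1;
    }

Because this codec keeps no state between rows, one routine could serve as
the row, strip, and tile variant alike; a scheme that maintains a reference
scanline would need distinct handling.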

Other Comments
--------------

The library handles most I/O buffering.  There are two data buffers
when decoding data: a raw data buffer that holds all the data in a
strip, and a user-supplied scanline buffer that compression schemes
place decoded data into.  When encoding data the data in the
user-supplied scanline buffer is encoded into the raw data buffer (from
where it is written).  Decoding routines should never have to explicitly
read data -- a full strip/tile's worth of raw data is read and scanlines
never cross strip boundaries.  Encoding routines must be cognizant of
the raw data buffer size and call :c:func:`TIFFFlushData1` when necessary.
Note that any pending data is automatically flushed when a new strip/tile is
started, so there's no need to do that in the tif_postencode routine (if
one exists).  Bit order is automatically handled by the library when
a raw strip or tile is filled.  If the decoded samples are interpreted
by the decoding routine before they are passed back to the user, then
the decoding logic must handle byte-swapping by overriding the
:c:member:`tif_postdecode`
routine (set it to :c:func:`TIFFNoPostDecode`) and doing the required work
internally.  For an example of doing this look at the horizontal
differencing code in the routines in :file:`tif_predict.c`.
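
For 8-bit samples, horizontal differencing can be sketched as below.  The
helper names are hypothetical and this is not the code in
:file:`tif_predict.c`; it only illustrates the idea that each sample is
stored as the difference from the sample ``stride`` bytes earlier.  Byte
order does not arise for 8-bit data; for wider samples the values would have
to be in host byte order before this arithmetic is applied, which is why
such a decoder has to take over the byte-swapping itself.

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>

    /* Undo differencing: add back the sample `stride` bytes earlier.
     * stride is the number of samples per pixel and must be at least 1. */
    static void hdiff_accumulate8(uint8_t *row, size_t nbytes, size_t stride)
    {
        for (size_t i = stride; i < nbytes; i++)
            row[i] = (uint8_t)(row[i] + row[i - stride]);
    }

    /* Apply differencing, working backwards so that each subtraction
     * still sees the original value of its left-hand neighbour. */
    static void hdiff_difference8(uint8_t *row, size_t nbytes, size_t stride)
    {
        for (size_t i = nbytes; i-- > stride;)
            row[i] = (uint8_t)(row[i] - row[i - stride]);
    }

The arithmetic wraps modulo 256, so accumulating after differencing
reproduces the original row exactly.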

The variables :c:member:`tif_rawcc`, :c:member:`tif_rawdata`, and
:c:member:`tif_rawcp` in a :c:struct:`TIFF` structure
are associated with the raw data buffer.  :c:member:`tif_rawcc` must be non-zero
for the library to automatically flush data.  The variable
:c:member:`tif_scanlinesize` is the size a user's scanline buffer should be.  The
variable :c:member:`tif_tilesize` is the size of a tile for tiled images.  This
should not normally be used by compression routines, except where it
relates to the compression algorithm.  That is, the ``cc`` parameter to the
:c:expr:`tif_decode*` and :c:expr:`tif_encode*`
routines should be used in terminating
decompression/compression.  This ensures these routines can be used,
for example, to decode/encode entire strips of data.
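
The interplay between an encoder and the raw data buffer can be sketched as
follows.  The ``mini_enc`` structure, its deliberately tiny buffer, and the
``flush_raw`` function (which stands in for the role played by
:c:func:`TIFFFlushData1`) are hypothetical simplifications.  Note that the
loop terminates on ``cc`` alone, not on a scanline or tile size.

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    /* Hypothetical stand-in for the raw-buffer fields of a TIFF structure. */
    typedef struct {
        uint8_t rawdata[8]; /* raw data buffer (tiny, to force flushes) */
        uint8_t *rawcp;     /* next free byte in rawdata */
        size_t rawcc;       /* bytes currently held in rawdata */
        uint8_t sink[64];   /* stands in for the output file */
        size_t sinklen;     /* bytes written to sink so far */
    } mini_enc;

    static void mini_enc_init(mini_enc *e)
    {
        e->rawcp = e->rawdata;
        e->rawcc = 0;
        e->sinklen = 0;
    }

    /* Write out pending raw data and reset the buffer; 0 on failure. */
    static int flush_raw(mini_enc *e)
    {
        if (e->rawcc > 0) {
            if (e->sinklen + e->rawcc > sizeof e->sink)
                return 0;
            memcpy(e->sink + e->sinklen, e->rawdata, e->rawcc);
            e->sinklen += e->rawcc;
        }
        e->rawcp = e->rawdata;
        e->rawcc = 0;
        return 1;
    }

    /* Encode cc bytes from bp (a plain copy here), flushing whenever
     * the raw buffer is full. */
    static int foo_encode(mini_enc *e, const uint8_t *bp, size_t cc)
    {
        while (cc > 0) {
            size_t room = sizeof e->rawdata - e->rawcc;
            if (room == 0) {
                if (!flush_raw(e))
                    return 0;
                room = sizeof e->rawdata;
            }
            size_t n = cc < room ? cc : room;
            memcpy(e->rawcp, bp, n);
            e->rawcp += n;
            e->rawcc += n;
            bp += n;
            cc -= n;
        }
        return 1;
    }

Data left in the buffer when ``foo_encode`` returns is not lost; in this
sketch a final ``flush_raw`` call writes it out, mirroring the automatic
flush described above.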

In general, if you have a new compression algorithm to add, work from
the code for an existing routine.  In particular,
:file:`tif_dumpmode.c`
has the trivial code for the "nil" compression scheme,
:file:`tif_packbits.c` is a
simple byte-oriented scheme that has to watch out for buffer
boundaries, and :file:`tif_lzw.c` has the LZW scheme that has the most
complexity -- it tracks the buffer boundary at a bit level.
Of course, using a private compression scheme (or private tags) limits
the portability of your TIFF files.

Internal functions
------------------

The following functions are private and are not part of the public API.

.. c:function:: int _TIFFRewriteField(TIFF *, uint16_t, TIFFDataType, tmsize_t, void *)


The following functions are static and not part of the public or private API.

.. c:function:: int TIFFFetchNormalTag(TIFF* tif, TIFFDirEntry* dp, int recover)

    Fetch a normal tag, not covered by special-case code

.. c:function:: int TIFFWriteDirectoryTagData(TIFF* tif, uint32_t* ndir, TIFFDirEntry* dir, uint16_t tag, uint16_t datatype, uint32_t count, uint32_t datalength, void* data)

.. c:function:: int TIFFFetchStripThing(TIFF* tif, TIFFDirEntry* dir, uint32_t nstrips, uint64_t** lpp)

.. c:function:: int TIFFAppendToStrip(TIFF* tif, uint32_t strip, uint8_t* data, tmsize_t cc)