A general-purpose TIFF library
http://okmij.org/ftp/Streams.html#random-bin-IO
The library gives the user the TIFF dictionary, which the user can search for specific tags and obtain the values associated with the tags, including the pixel matrix.
The overarching theme is incremental processing: initially, only the TIFF dictionary is read. The value associated with a tag is read only when that tag is looked up (unless the value was short and was packed in the TIFF dictionary entry). The pixel matrix (let alone the whole TIFF file) is not loaded in memory -- the pixel matrix is not even located before it is needed. The matrix is processed incrementally, by a user-supplied iteratee.
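The remark about short values being packed into the dictionary entry follows from the TIFF entry layout: each entry has a 4-byte field that holds the value itself when the value fits in four bytes, and a file offset otherwise. The sketch below illustrates that rule only. The names `FieldType`, `typeSize`, `Location` and `locate` are hypothetical and are not part of this library, whose own type is `TIFF_TYPE`.

```haskell
import Data.Word (Word32)

-- Hypothetical stand-in for the library's TIFF_TYPE.
data FieldType
  = TByte | TAscii | TShort | TLong | TRational
  | TSByte | TUndefined | TSShort | TSLong | TSRational
  | TFloat | TDouble
  deriving (Eq, Show)

-- Size in bytes of a single value of each TIFF field type.
typeSize :: FieldType -> Integer
typeSize t = case t of
  TByte      -> 1
  TAscii     -> 1
  TShort     -> 2
  TLong      -> 4
  TRational  -> 8
  TSByte     -> 1
  TUndefined -> 1
  TSShort    -> 2
  TSLong     -> 4
  TSRational -> 8
  TFloat     -> 4
  TDouble    -> 8

-- Where the data of a dictionary entry lives.
data Location = Inline | AtOffset Word32
  deriving (Eq, Show)

-- Given the field type, the value count and the raw 4-byte field,
-- decide whether the field is the data itself or a file offset.
locate :: FieldType -> Word32 -> Word32 -> Location
locate t count field
  | typeSize t * toInteger count <= 4 = Inline
  | otherwise                         = AtOffset field

main :: IO ()
main = do
  print (locate TShort 1 640)      -- a single 16-bit value fits in the field
  print (locate TAscii 5 300)      -- five characters do not fit
  print (locate TRational 1 200)   -- a rational takes eight bytes
```

Only entries classified as `AtOffset` would require a seek and a further read, which is why such values can be deferred until the tag is looked up.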
The incremental processing is accomplished by iteratees and enumerators. The enumerators are indeed first-class: they are stored in the interned TIFF dictionary data structure. These enumerators represent the values associated with tags; the values are read on demand, when the enumerator is applied to a user-given iteratee.
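To make the idea of first-class enumerators concrete, here is a deliberately simplified model. It is not the library's `IterateeGM` or `EnumeratorGMM`: the types `Iter` and `Enumer` and the functions `feed`, `run`, `sumN` and `lookupWith` are assumptions made for this illustration, and the data comes from in-memory lists rather than a file. The rank-2 field mirrors the `forall a.` in the `TIFFDE_ENUM` constructors.

```haskell
{-# LANGUAGE RankNTypes #-}

import qualified Data.IntMap as IM

-- A toy iteratee: finished with a result, or waiting for input.
-- Nothing stands for the end of the stream.
data Iter el a
  = Done a
  | Cont (Maybe el -> Iter el a)

-- A toy enumerator: it can feed its data to an iteratee of any result type.
newtype Enumer el = Enumer (forall a. Iter el a -> Iter el a)

-- Feed list elements one by one, stopping once the iteratee is done.
feed :: [el] -> Iter el a -> Iter el a
feed (x : xs) (Cont k) = feed xs (k (Just x))
feed _        it       = it

enumList :: [el] -> Enumer el
enumList xs = Enumer (feed xs)

-- Extract the result, signalling end of stream if the iteratee still waits.
run :: Iter el a -> Maybe a
run (Done a) = Just a
run (Cont k) = case k Nothing of
  Done a -> Just a
  Cont _ -> Nothing

-- An iteratee that sums at most n elements.
sumN :: Int -> Iter Int Int
sumN n0 = go n0 0
  where
    go 0 acc = Done acc
    go n acc = Cont step
      where
        step (Just x) = go (n - 1) (acc + x)
        step Nothing  = Done acc

-- A dictionary that stores enumerators, keyed by tag number.
dict :: IM.IntMap (Enumer Int)
dict = IM.fromList
  [ (256, enumList [640])
  , (273, enumList [8, 4104, 8200])
  ]

-- Nothing is produced until a tag is looked up and an iteratee is supplied.
lookupWith :: Int -> Iter Int a -> Maybe a
lookupWith tag it = case IM.lookup tag dict of
  Nothing         -> Nothing
  Just (Enumer e) -> run (e it)

main :: IO ()
main = do
  print (lookupWith 273 (sumN 2))
  print (lookupWith 256 (sumN 2))
  print (lookupWith 999 (sumN 2))
```

The dictionary holds functions rather than data, so the cost of producing a value is paid only by the lookup that actually asks for it.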
The library extensively uses nested streams, tacitly converting the stream of raw bytes from the file into streams of integers, rationals and other user-friendly items. The pixel matrix is presented as a contiguous stream, regardless of its segmentation into strips and physical arrangement. The library demonstrates random IO and binary parsing, including reading multi-byte numeric data in big- or little-endian formats. The library can be easily adapted for AIFF, RIFF and other IFF formats.
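As an illustration of endian-aware parsing, the following sketch decodes the 8-byte TIFF header from a list of bytes: a byte-order mark (`II` for little-endian, `MM` for big-endian), the magic number 42, and the offset of the first directory. It works on plain lists instead of the library's streams, and every name in it (`Endian`, `combine`, `readU16`, `readU32`, `parseHeader`) is hypothetical.

```haskell
import Control.Monad (guard)
import Data.Bits (shiftL, (.|.))
import Data.Word (Word8, Word16, Word32)

data Endian = BigEndian | LittleEndian
  deriving (Eq, Show)

-- "II" (0x49 0x49) marks little-endian data, "MM" (0x4D 0x4D) big-endian.
byteOrder :: [Word8] -> Maybe Endian
byteOrder (0x49 : 0x49 : _) = Just LittleEndian
byteOrder (0x4D : 0x4D : _) = Just BigEndian
byteOrder _                 = Nothing

-- Combine up to four bytes into one number, honouring the byte order.
combine :: Endian -> [Word8] -> Word32
combine e bytes = foldl step 0 (msbFirst e)
  where
    msbFirst BigEndian    = bytes
    msbFirst LittleEndian = reverse bytes
    step acc b = (acc `shiftL` 8) .|. fromIntegral b

-- Read a 16-bit number and return it with the remaining input.
readU16 :: Endian -> [Word8] -> Maybe (Word16, [Word8])
readU16 e bs = case splitAt 2 bs of
  (raw@[_, _], rest) -> Just (fromIntegral (combine e raw), rest)
  _                  -> Nothing

-- Read a 32-bit number and return it with the remaining input.
readU32 :: Endian -> [Word8] -> Maybe (Word32, [Word8])
readU32 e bs = case splitAt 4 bs of
  (raw@[_, _, _, _], rest) -> Just (combine e raw, rest)
  _                        -> Nothing

-- Byte order, then the magic number 42, then the first directory offset.
parseHeader :: [Word8] -> Maybe (Endian, Word32)
parseHeader bs = do
  e <- byteOrder bs
  (magic, rest) <- readU16 e (drop 2 bs)
  guard (magic == 42)
  (off, _) <- readU32 e rest
  return (e, off)

main :: IO ()
main = do
  print (parseHeader [0x49, 0x49, 0x2A, 0x00, 0x08, 0x00, 0x00, 0x00])
  print (parseHeader [0x4D, 0x4D, 0x00, 0x2A, 0x00, 0x00, 0x00, 0x08])
  print (parseHeader [0x00, 0x01, 0x02])
```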
We show a representative application of the library: reading a sample TIFF file, printing selected values from the TIFF dictionary, verifying the values of selected pixels and computing the histogram of pixel values. The pixel verification procedure stops reading the pixel matrix as soon as all specified pixel values are verified. The histogram accumulation does read the entire matrix, but incrementally. Neither pixel matrix processing procedure loads the whole matrix in memory. In fact, we never read and retain more than the IO-buffer-full of raw data.
- compute_hist :: TIFFDict -> IterateeGM Word8 RBIO (Int, IntMap Int)
- type EnumeratorGMM elfrom elto m a = IterateeG elto m a -> IterateeGM elfrom m a
- type TIFFDict = IntMap TIFFDE
- data TIFFDE = TIFFDE {}
- data TIFFDE_ENUM
- = TEN_CHAR (forall a. EnumeratorGMM Word8 Char RBIO a)
- | TEN_BYTE (forall a. EnumeratorGMM Word8 Word8 RBIO a)
- | TEN_INT (forall a. EnumeratorGMM Word8 Integer RBIO a)
- | TEN_RAT (forall a. EnumeratorGMM Word8 Rational RBIO a)
- data TIFF_TYPE
- = TT_NONE
- | TT_byte
- | TT_ascii
- | TT_short
- | TT_long
- | TT_rational
- | TT_sbyte
- | TT_undefined
- | TT_sshort
- | TT_slong
- | TT_srational
- | TT_float
- | TT_double
- data TIFF_TAG
- = TG_other Int
- | TG_SUBFILETYPE
- | TG_OSUBFILETYPE
- | TG_IMAGEWIDTH
- | TG_IMAGELENGTH
- | TG_BITSPERSAMPLE
- | TG_COMPRESSION
- | TG_PHOTOMETRIC
- | TG_THRESHOLDING
- | TG_CELLWIDTH
- | TG_CELLLENGTH
- | TG_FILLORDER
- | TG_DOCUMENTNAME
- | TG_IMAGEDESCRIPTION
- | TG_MAKE
- | TG_MODEL
- | TG_STRIPOFFSETS
- | TG_ORIENTATION
- | TG_SAMPLESPERPIXEL
- | TG_ROWSPERSTRIP
- | TG_STRIPBYTECOUNTS
- | TG_MINSAMPLEVALUE
- | TG_MAXSAMPLEVALUE
- | TG_XRESOLUTION
- | TG_YRESOLUTION
- | TG_PLANARCONFIG
- | TG_PAGENAME
- | TG_XPOSITION
- | TG_YPOSITION
- | TG_FREEOFFSETS
- | TG_FREEBYTECOUNTS
- | TG_GRAYRESPONSEUNIT
- | TG_GRAYRESPONSECURVE
- | TG_GROUP3OPTIONS
- | TG_GROUP4OPTIONS
- | TG_RESOLUTIONUNIT
- | TG_PAGENUMBER
- | TG_COLORRESPONSEUNIT
- | TG_COLORRESPONSECURVE
- | TG_SOFTWARE
- | TG_DATETIME
- | TG_ARTIST
- | TG_HOSTCOMPUTER
- | TG_PREDICTOR
- | TG_WHITEPOINT
- | TG_PRIMARYCHROMATICITIES
- | TG_COLORMAP
- | TG_BADFAXLINES
- | TG_CLEANFAXDATA
- | TG_CONSECUTIVEBADFAXLINES
- | TG_MATTEING
- tag_to_int :: TIFF_TAG -> Int
- int_to_tag :: Int -> TIFF_TAG
- tiff_reader :: IterateeGM Word8 RBIO (Maybe TIFFDict)
- u32_to_float :: Word32 -> Double
- u32_to_s32 :: Word32 -> Int32
- u16_to_s16 :: Word16 -> Int16
- u8_to_s8 :: Word8 -> Int8
- note :: [String] -> IterateeGM el RBIO ()
- load_dict :: IterateeGM Word8 RBIO (Maybe TIFFDict)
- pixel_matrix_enum :: TIFFDict -> EnumeratorN Word8 Word8 RBIO a
- dict_read_int :: TIFF_TAG -> TIFFDict -> IterateeGM Word8 RBIO (Maybe Integer)
- dict_read_ints :: TIFF_TAG -> TIFFDict -> IterateeGM Word8 RBIO (Maybe [Integer])
- dict_read_rat :: TIFF_TAG -> TIFFDict -> IterateeGM Word8 RBIO (Maybe Rational)
- dict_read_string :: TIFF_TAG -> TIFFDict -> IterateeGM Word8 RBIO (Maybe String)
Documentation
compute_hist :: TIFFDict -> IterateeGM Word8 RBIO (Int, IntMap Int)
Sample TIFF user code
The following is sample code using the TIFF library (whose implementation is in the second part of this file). Our sample code prints interesting information from the TIFF dictionary (such as the dimensions, the resolution and the name of the image).
The sample file is a GNU logo (from http://www.gnu.org) converted from JPG to TIFF. Copyleft by GNU.
The main user function. tiff_reader is the library function that builds the TIFF dictionary. process_tiff is the user function that extracts useful data from the dictionary.
Sample TIFF processing function
Sample processing of the pixel matrix: computing the histogram
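The essence of incremental histogram accumulation can be sketched without iteratees, as a strict fold over chunks of pixels. This is not the code of compute_hist: the chunked list stands in for the IO buffers, the names `Hist`, `step`, `histChunk` and `histogram` are hypothetical, and reading the result as a pair of total pixel count and histogram is an assumption based on the type `(Int, IntMap Int)`.

```haskell
import qualified Data.IntMap.Strict as IM
import Data.List (foldl')
import Data.Word (Word8)

-- Total number of pixels seen so far, and the count for each pixel value.
type Hist = (Int, IM.IntMap Int)

-- Account for a single pixel; the counter is forced to avoid building thunks.
step :: Hist -> Word8 -> Hist
step (n, h) p =
  let n' = n + 1
  in n' `seq` (n', IM.insertWith (+) (fromIntegral p) 1 h)

-- Account for one chunk of pixels (for example, one buffer-full).
histChunk :: Hist -> [Word8] -> Hist
histChunk = foldl' step

-- Process the chunks one after another; only the accumulator is retained.
histogram :: [[Word8]] -> Hist
histogram = foldl' histChunk (0, IM.empty)

main :: IO ()
main = print (histogram [[0, 255, 255], [7, 0]])
```

The accumulator has at most 256 entries for 8-bit pixels, so memory use does not grow with the size of the image.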
type EnumeratorGMM elfrom elto m a = IterateeG elto m a -> IterateeGM elfrom m a
Another sample processor of the pixel matrix: verifying the values of some pixels. This processor does not read the whole matrix; it stops as soon as everything is verified or an error is detected.
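The early-termination behaviour can be illustrated with a small sketch over a lazy list. It is not the library's verification procedure: `Check` and `verify` are hypothetical names, and the checks are assumed to be sorted by strictly increasing offset. The second component of the result reports how many stream elements were inspected.

```haskell
import Data.Word (Word8)

-- (offset of the pixel in the stream, expected value).
-- Assumption: the checks are sorted by strictly increasing offset.
type Check = (Int, Word8)

-- Returns the verdict and the number of stream elements inspected.
verify :: [Check] -> [Word8] -> (Bool, Int)
verify = go 0
  where
    go pos [] _ = (True, pos)            -- all checks passed: stop reading
    go pos _ [] = (False, pos)           -- stream ended before all checks
    go pos cs@((i, v) : rest) (p : ps)
      | pos < i   = go (pos + 1) cs ps   -- not at the next checked pixel yet
      | p == v    = go (pos + 1) rest ps -- this check passed
      | otherwise = (False, pos + 1)     -- mismatch: stop immediately

main :: IO ()
main = do
  -- The stream is infinite, yet both calls terminate.
  print (verify [(0, 1), (4, 2)] (cycle [1, 2, 3]))
  print (verify [(1, 9)] (cycle [1, 2, 3]))
```

Because the traversal stops at the last check or the first mismatch, the rest of the stream is never demanded.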
TIFF library code
We need a more general enumerator type: an enumerator that maps streams (not necessarily in lock-step). This is a flattened (`joinI-ed') EnumeratorN elfrom elto m a.
type TIFFDict = IntMap TIFFDE
A TIFF directory is a finite map associating a TIFF tag with a record TIFFDE
data TIFFDE_ENUM
= TEN_CHAR (forall a. EnumeratorGMM Word8 Char RBIO a)
| TEN_BYTE (forall a. EnumeratorGMM Word8 Word8 RBIO a)
| TEN_INT (forall a. EnumeratorGMM Word8 Integer RBIO a)
| TEN_RAT (forall a. EnumeratorGMM Word8 Rational RBIO a)
Standard TIFF data types
Standard TIFF tags
tag_to_int :: TIFF_TAG -> Int
int_to_tag :: Int -> TIFF_TAG
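A pair of conversions like these can be driven by a single table, with a catch-all constructor for unrecognized codes in the spirit of `TG_other`. The sketch below is an assumption about one possible shape, not the library's implementation: `Tag`, `table`, `tagToInt` and `intToTag` are hypothetical names, and only a few tags are listed, with their numeric codes from the TIFF specification.

```haskell
import Data.Maybe (fromMaybe)
import Data.Tuple (swap)

-- A small subset of tags, plus a catch-all for unknown codes.
data Tag
  = ImageWidth
  | ImageLength
  | BitsPerSample
  | Compression
  | StripOffsets
  | RowsPerStrip
  | StripByteCounts
  | Other Int
  deriving (Eq, Show)

-- One table serves both directions.
table :: [(Tag, Int)]
table =
  [ (ImageWidth, 256)
  , (ImageLength, 257)
  , (BitsPerSample, 258)
  , (Compression, 259)
  , (StripOffsets, 273)
  , (RowsPerStrip, 278)
  , (StripByteCounts, 279)
  ]

tagToInt :: Tag -> Int
tagToInt (Other n) = n
tagToInt t = fromMaybe (error "tag missing from table") (lookup t table)

-- Codes that are not in the table are preserved in the catch-all.
intToTag :: Int -> Tag
intToTag n = fromMaybe (Other n) (lookup n (map swap table))

main :: IO ()
main = do
  print (intToTag 273)
  print (intToTag 12345)
  print (map (tagToInt . intToTag) [256, 257, 12345])
```

Keeping the unknown code inside the catch-all constructor means converting a code to a tag and back never loses information.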
tiff_reader :: IterateeGM Word8 RBIO (Maybe TIFFDict)
The library function to read the TIFF dictionary
u32_to_float :: Word32 -> Double
A few conversion procedures
u32_to_s32 :: Word32 -> Int32
u16_to_s16 :: Word16 -> Int16
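Conversions with these signatures can be written with standard functions, as the following sketch shows. It is one possible way to obtain the same types, not necessarily how the library does it; the camel-case names are hypothetical. `fromIntegral` between fixed-width types of equal size keeps the bit pattern, and `castWord32ToFloat` from `GHC.Float` (available in base 4.10 and later) reinterprets the bits as an IEEE single-precision number.

```haskell
import Data.Int (Int16, Int32, Int8)
import Data.Word (Word16, Word32, Word8)
import GHC.Float (castWord32ToFloat, float2Double)

-- Reinterpret unsigned values as two's-complement signed values.
u8ToS8 :: Word8 -> Int8
u8ToS8 = fromIntegral

u16ToS16 :: Word16 -> Int16
u16ToS16 = fromIntegral

u32ToS32 :: Word32 -> Int32
u32ToS32 = fromIntegral

-- Reinterpret 32 bits as an IEEE 754 single, then widen to Double.
u32ToDouble :: Word32 -> Double
u32ToDouble = float2Double . castWord32ToFloat

main :: IO ()
main = do
  print (u8ToS8 0x80)
  print (u16ToS16 0xFFFF)
  print (u32ToS32 0xFFFFFFFE)
  print (u32ToDouble 0x3F800000)
  print (u32ToDouble 0xC0000000)
```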
load_dict :: IterateeGM Word8 RBIO (Maybe TIFFDict)
An internal function to load the dictionary. It assumes that the stream is positioned to read the dictionary
pixel_matrix_enum :: TIFFDict -> EnumeratorN Word8 Word8 RBIO a
Reading the pixel matrix
For simplicity, we assume no compression and 8-bit pixels.
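With no compression and 8-bit pixels, the pixel data is simply the concatenation of the strips, each located by an offset (tag StripOffsets) and a length (tag StripByteCounts). The sketch below shows that stitching on an in-memory list of bytes; the library instead seeks in the file and streams each strip to the iteratee. The function `pixelStream` and its argument layout are hypothetical.

```haskell
import Data.Word (Word8)

-- Present the strips as one contiguous stream of pixels.
-- offsets: where each strip starts in the file
-- counts:  how many bytes each strip holds
pixelStream :: [Int] -> [Int] -> [Word8] -> [Word8]
pixelStream offsets counts file =
  concat [take n (drop off file) | (off, n) <- zip offsets counts]

main :: IO ()
main = do
  let file = [0 .. 19] :: [Word8]
  -- Two strips stored out of order in the file: the first strip lives at
  -- offset 10, the second at offset 2.
  print (pixelStream [10, 2] [3, 4] file)
```

The consumer sees pixels in image order even though the strips are laid out differently in the file.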
dict_read_int :: TIFF_TAG -> TIFFDict -> IterateeGM Word8 RBIO (Maybe Integer)
A few helpers for getting data from TIFF dictionary
dict_read_ints :: TIFF_TAG -> TIFFDict -> IterateeGM Word8 RBIO (Maybe [Integer])
dict_read_rat :: TIFF_TAG -> TIFFDict -> IterateeGM Word8 RBIO (Maybe Rational)
dict_read_string :: TIFF_TAG -> TIFFDict -> IterateeGM Word8 RBIO (Maybe String)