Random and Binary IO with IterateeM
http://okmij.org/ftp/Streams.html#random-bin-IO
Random and binary IO: Reading TIFF
Iteratees presuppose sequential processing. A general-purpose input method must also support random IO: processing a seek-able input stream from an arbitrary position, jumping back and forth through the stream. We demonstrate random IO with iteratees, as well as reading non-textual files and converting raw bytes into multi-byte quantities such as integers, rationals, and TIFF dictionaries. Positioning of the input stream is evocative of delimited continuations.
We use random and binary IO to write a general-purpose TIFF library. The library emphasizes incremental processing, relying on iteratees and enumerators for on-demand reading of tag values. The library extensively uses nested streams, tacitly converting the stream of raw bytes from the file into streams of integers, rationals and other user-friendly items. The pixel matrix is presented as a contiguous stream, regardless of its segmentation into strips and physical arrangement.
We show a representative application of the library: reading a sample TIFF file, printing selected values from the TIFF dictionary, verifying the values of selected pixels and computing the histogram of pixel values. The pixel verification procedure stops reading the pixel matrix as soon as all specified pixel values are verified. The histogram accumulation does read the entire matrix, but incrementally. Neither pixel matrix processing procedure loads the whole matrix into memory. In fact, we never read and retain more than one IO buffer's worth of raw data.
Version: The current version is 1.1, December 2008.
- newtype RBIO a = RBIO {}
- data RBState = RBState {}
- rb_seek_set :: FileOffset -> RBIO ()
- rb_seek_answered :: RBIO Bool
- rb_msb_first :: RBIO Bool
- rb_msb_first_set :: Bool -> RBIO ()
- runRB :: RBState -> IterateeGM el RBIO a -> IO (IterateeG el RBIO a)
- bindm :: Monad m => m (Maybe a) -> (a -> m (Maybe b)) -> m (Maybe b)
- sseek :: FileOffset -> IterateeGM el RBIO ()
- iter_err :: Monad m => String -> IterateeGM el m ()
- stakeR :: Monad m => Int -> EnumeratorN el el m a
- endian_read2 :: IterateeGM Word8 RBIO (Maybe Word16)
- endian_read4 :: IterateeGM Word8 RBIO (Maybe Word32)
- enum_fd_random :: Fd -> EnumeratorGM Word8 RBIO a
Documentation
The type of the IO monad supporting seek requests and endianness. The seek_request is not quite a state; it is more like a `communication channel' set by the iteratee and answered by the enumerator. Since the base monad is IO, it seems simpler to implement both endianness and seek requests as IORef cells. Their names are grouped in a structure RBState, which is propagated as the `environment.'
Generally, RBState is opaque and should not be exported.
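The description above can be read as a reader-style monad over IO. The following is a minimal sketch of that reading, not the module's actual source: the field names `msb_first` and `unRBIO`, the default of big-endian reads, and the helper `runRBIO` are assumptions made for illustration (only `seek_req` is named in the text).

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import System.Posix.Types (FileOffset)

-- Assumed layout: one IORef cell per piece of mutable state.
data RBState = RBState
  { msb_first :: IORef Bool                -- True means big-endian reads (assumed name)
  , seek_req  :: IORef (Maybe FileOffset)  -- pending seek request, if any
  }

-- IO actions that receive the RBState record as an environment.
newtype RBIO a = RBIO { unRBIO :: RBState -> IO a }

instance Functor RBIO where
  fmap f (RBIO m) = RBIO (fmap f . m)

instance Applicative RBIO where
  pure x = RBIO (\_ -> pure x)
  RBIO mf <*> RBIO mx = RBIO (\s -> mf s <*> mx s)

instance Monad RBIO where
  RBIO m >>= f = RBIO (\s -> m s >>= \x -> unRBIO (f x) s)

-- Query the current endianness flag.
rb_msb_first :: RBIO Bool
rb_msb_first = RBIO (readIORef . msb_first)

-- Change the endianness flag.
rb_msb_first_set :: Bool -> RBIO ()
rb_msb_first_set flag = RBIO (\s -> writeIORef (msb_first s) flag)

-- Hypothetical helper: allocate fresh cells and run an RBIO action.
-- The big-endian default is an assumption of this sketch.
runRBIO :: RBIO a -> IO a
runRBIO (RBIO m) = do
  msb <- newIORef True
  req <- newIORef Nothing
  m (RBState msb req)
```

Because the cells live in IO, every action run against the same record sees the latest flag: under this sketch, `runRBIO (rb_msb_first_set False >> rb_msb_first)` yields `False`.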
rb_seek_set :: FileOffset -> RBIO ()
The programmer should use the following functions instead of accessing RBState directly.
To request seeking, the iteratee sets seek_req to (Just desired_offset). When the enumerator answers the request, it sets seek_req back to Nothing.
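The request/answer handshake can be illustrated without the RBIO wrapper, using a bare IORef cell. This is a sketch under assumptions: `SeekCell`, `newSeekCell`, `seekSet`, `seekAnswered` and `answerSeek` are hypothetical names standing in for the roles of rb_seek_set, rb_seek_answered and the enumerator's bookkeeping; the real enumerator would also reposition the file descriptor.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import System.Posix.Types (FileOffset)

-- The communication channel: Nothing means no request is pending.
type SeekCell = IORef (Maybe FileOffset)

newSeekCell :: IO SeekCell
newSeekCell = newIORef Nothing

-- Iteratee side: post a seek request.
seekSet :: SeekCell -> FileOffset -> IO ()
seekSet cell off = writeIORef cell (Just off)

-- Iteratee side: the request is answered once the cell is Nothing again.
seekAnswered :: SeekCell -> IO Bool
seekAnswered cell = fmap (maybe True (const False)) (readIORef cell)

-- Enumerator side: pick up a pending request, reset the cell, and
-- return the offset the caller should reposition to (if any).
answerSeek :: SeekCell -> IO (Maybe FileOffset)
answerSeek cell = do
  req <- readIORef cell
  case req of
    Nothing  -> return Nothing
    Just off -> writeIORef cell Nothing >> return (Just off)
```

In this sketch, after `seekSet cell 42` the check `seekAnswered cell` returns `False` until `answerSeek cell` has run, at which point it returns `True` again.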
rb_msb_first_set :: Bool -> RBIO ()
bindm :: Monad m => m (Maybe a) -> (a -> m (Maybe b)) -> m (Maybe b)
A useful combinator. Perhaps a better idea would have been to define Iteratee to have (Maybe a) in IE_done? In that case, we could make IterateeGM an instance of MonadPlus.
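The signature of bindm admits one natural definition: run the first action, stop with Nothing if it produced none, otherwise pass the value on. The sketch below uses that definition; `lookupEntry` and `follow2` are hypothetical helpers, with an association list standing in for a dictionary of tag values.

```haskell
-- Chain two actions that may each fail to produce a value.
bindm :: Monad m => m (Maybe a) -> (a -> m (Maybe b)) -> m (Maybe b)
bindm m f = m >>= maybe (return Nothing) f

-- Hypothetical lookup that may fail, wrapped in IO.
lookupEntry :: [(Int, Int)] -> Int -> IO (Maybe Int)
lookupEntry table key = return (lookup key table)

-- Follow two links in the table; any missing link yields Nothing.
follow2 :: [(Int, Int)] -> Int -> IO (Maybe Int)
follow2 table key = lookupEntry table key `bindm` lookupEntry table
```

With the table `[(1,2),(2,3)]`, `follow2` returns `Just 3` for key 1 and `Nothing` for key 2, since the second lookup (of 3) fails.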
sseek :: FileOffset -> IterateeGM el RBIO ()
We discard all available input first. We keep discarding the stream s until we determine that our request has been answered: rb_seek_set sets the state seek_req to (Just off). When the request is answered, the state goes back to Nothing. The above features remind one of delimited continuations.
iter_err :: Monad m => String -> IterateeGM el m ()
An iteratee that reports and propagates an error.
We disregard the input first and then propagate the error.
It is reminiscent of abort.
stakeR :: Monad m => Int -> EnumeratorN el el m a
Read n elements from a stream and apply the given iteratee to the stream of the read elements. If the given iteratee accepted fewer elements, we stop.
This is a variation of stake with early termination: processing of the outer stream ends as soon as processing of the inner stream finishes early. This variation is particularly useful for random IO, where we need not bother to `drain the input stream'.
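The early-termination behaviour can be modelled with a deliberately simplified, pure iteratee fed from a list. This is only an illustration of the idea, not the library's IterateeGM or EnumeratorN types: `Iter`, `feedAtMost`, `headI`, `sumI` and `finish` are all hypothetical names introduced for this sketch.

```haskell
-- A toy iteratee: either finished, or waiting for the next element.
-- Nothing passed to the continuation signals end of stream.
data Iter el a
  = Done a
  | Cont (Maybe el -> Iter el a)

-- Feed at most n elements to the iteratee, stopping as soon as it is
-- done. Returns the iteratee and the input left untouched.
feedAtMost :: Int -> Iter el a -> [el] -> (Iter el a, [el])
feedAtMost _ i@(Done _) xs = (i, xs)       -- inner finished early: stop
feedAtMost n i xs | n <= 0 = (i, xs)       -- quota of n elements used up
feedAtMost _ i [] = (i, [])                -- outer stream exhausted
feedAtMost n (Cont k) (x:xs) = feedAtMost (n - 1) (k (Just x)) xs

-- Accepts a single element and finishes.
headI :: Iter el (Maybe el)
headI = Cont (\mx -> Done mx)

-- Sums every element it is given until end of stream.
sumI :: Int -> Iter Int Int
sumI acc = Cont step
  where
    step Nothing  = Done acc
    step (Just x) = sumI (acc + x)

-- Signal end of stream and extract the result, if any.
finish :: Iter el a -> Maybe a
finish (Done a) = Just a
finish (Cont k) = case k Nothing of
  Done a -> Just a
  Cont _ -> Nothing
```

In this model, `feedAtMost 3 headI [10,20,30,40]` leaves `[20,30,40]` unconsumed because `headI` finishes after one element, whereas `feedAtMost 3 (sumI 0) [10,20,30,40]` consumes three elements and leaves `[40]`. A draining variant would skip the remaining elements of the quota in the first case as well.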
endian_read2 :: IterateeGM Word8 RBIO (Maybe Word16)
Iteratees to read unsigned integers stored in big- or little-endian byte order.
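The byte-to-integer conversion behind such reads can be sketched as pure functions over a list of bytes. The names `bytesToWord16` and `bytesToWord32` are hypothetical; the Bool argument plays the role of the msb_first flag, and Nothing models a stream that ended before enough bytes arrived.

```haskell
import Data.Bits (shiftL, (.|.))
import Data.Word (Word8, Word16, Word32)

-- Assemble a 16-bit unsigned integer from exactly two bytes.
bytesToWord16 :: Bool -> [Word8] -> Maybe Word16
bytesToWord16 msbFirst [b1, b2]
  | msbFirst  = Just (pack b1 b2)   -- big-endian: first byte is the high byte
  | otherwise = Just (pack b2 b1)   -- little-endian: first byte is the low byte
  where
    pack hi lo = (fromIntegral hi `shiftL` 8) .|. fromIntegral lo
bytesToWord16 _ _ = Nothing

-- Assemble a 32-bit unsigned integer from exactly four bytes.
bytesToWord32 :: Bool -> [Word8] -> Maybe Word32
bytesToWord32 msbFirst bs
  | length bs /= 4 = Nothing
  | otherwise      = Just (foldl step 0 ordered)
  where
    -- Put the most significant byte first, then shift-and-or left to right.
    ordered = if msbFirst then bs else reverse bs
    step acc b = (acc `shiftL` 8) .|. fromIntegral b
```

For example, the bytes `[0x01, 0x02]` give `Just 0x0102` when read most-significant-byte first and `Just 0x0201` otherwise.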
enum_fd_random :: Fd -> EnumeratorGM Word8 RBIO a
The enumerator of a POSIX Fd: a variation of enum_fd that supports random IO (seek requests).