\section{Core types} Most of this library's types and functions are exported from the {\tt Data.Enumerator} module. :f Data/Enumerator.hs |Data.Enumerator module header| module Data.Enumerator ( |Data.Enumerator exports| ) where |Data.Enumerator imports| : \noindent A few utility functions share names with functions from the Prelude, so those are removed from the default namespace. :d Data.Enumerator imports import qualified Prelude as Prelude import Prelude hiding ( |excluded Prelude imports| ) : :d Data.Enumerator exports -- * Core -- ** Types Stream (..) , Iteratee (..) , Step (..) , Enumerator , Enumeratee : \subsection{Input streams} A {\tt Stream} is a sequence of chunks generated by an enumerator or enumeratee. Chunks might be composite values, such as a string, or atomic, such as a parser event. Allowing a stream to support multiple chunks slightly complicates iteratee and enumeratee implementation, but greatly simplifies handling of leftover inputs. {\tt (Chunks [])} is a legal value, used when a stream is still active but no data is currently available. Iteratees and enumeratees often special-case empty chunks for performance reasons, though they're not required to. :f Data/Enumerator.hs |apidoc Data.Enumerator.Stream| data Stream a = Chunks [a] | EOF deriving (Show, Eq) instance Monad Stream where return = Chunks . return Chunks xs >>= f = mconcat (fmap f xs) EOF >>= _ = EOF instance Functor Stream where fmap f (Chunks xs) = Chunks (fmap f xs) fmap _ EOF = EOF instance A.Applicative Stream where pure = return (<*>) = CM.ap : The {\tt Monoid} instance deserves some special attention, because it has the unexpected behavior that {\tt mappend EOF (Chunks []) == EOF}. Although it's reasonable that appending chunks to an {\sc eof} stream should provide a valid stream, such behavior would violate the monoid laws. :d Data.Enumerator imports import Data.Monoid (Monoid, mempty, mappend, mconcat) : :f Data/Enumerator.hs instance Monoid (Stream a) where mempty = Chunks mempty mappend (Chunks xs) (Chunks ys) = Chunks (xs ++ ys) mappend _ _ = EOF : \subsection{Iteratees} The primary data type for this library is {\tt Iteratee}, which consumes input until it either generates a value or encounters an error. Rather than requiring all input at once, an iteratee will return {\tt Continue} when it is capable of processing more data. In general, iteratees begin in the {\tt Continue} state. As each chunk is passed to the continuation, the iteratee may return the next step, which is one of: \begin{itemize} \item {\tt Continue}: The iteratee is capable of accepting more input. Note that more input is not required; the iteratee might be able to generate a value immediately if the stream ends. \item {\tt Yield}: The iteratee has received enough input to generate a result. Included in this value is left-over input, which can be passed to the next iteratee. \item {\tt Error}: The iteratee encountered an error which prevents it from proceeding further. \end{itemize} :d Data.Enumerator imports import qualified Control.Exception as Exc : :f Data/Enumerator.hs data Step a m b |apidoc Data.Enumerator.Continue| = Continue (Stream a -> Iteratee a m b) |apidoc Data.Enumerator.Yield| | Yield b (Stream a) |apidoc Data.Enumerator.Error| | Error Exc.SomeException |apidoc Data.Enumerator.Iteratee| newtype Iteratee a m b = Iteratee { runIteratee :: m (Step a m b) } : Users often need to construct iteratees which only yield or continue, so we define some helper functions to save typing: :f Data/Enumerator.hs |apidoc Data.Enumerator.returnI| returnI :: Monad m => Step a m b -> Iteratee a m b returnI step = Iteratee (return step) |apidoc Data.Enumerator.yield| yield :: Monad m => b -> Stream a -> Iteratee a m b yield x extra = returnI (Yield x extra) |apidoc Data.Enumerator.continue| continue :: Monad m => (Stream a -> Iteratee a m b) -> Iteratee a m b continue k = returnI (Continue k) : :d Data.Enumerator exports , returnI , yield , continue : \subsection{Enumerators} Enumerators typically read from an external source (parser, handle, random number generator, etc). They feed chunks into an iteratee until the source runs out of data (triggering {\tt EOF}) or the iteratee finishes processing (yields a value). Since {\tt Iteratee} is an alias for {\tt m (Step a m b)}, enumerators can also be considered step transformers of type {\tt Step a m b -> m (Step a m b)}. :f Data/Enumerator.hs |apidoc Data.Enumerator.Enumerator| type Enumerator a m b = Step a m b -> Iteratee a m b : Although enumerators can be encoded as a simple step transformer with the type {\tt Step a m b -> Step a m b}, encoding as a computation allows easier reasoning about the order of side effects. Consider the case of enumerating two files: :d enumerator example let iterFoo = enumFile "foo.txt" iterWhatever let iterBar = enumFile "bar.txt" iterFoo : It's impossible to determine, merely by looking at these lines, which file will be opened first. In fact, depending on the implementation of {\tt enumFile}, both files might be open at the same time. If enumerators return monadic values, the order of events is more clear: :d enumerator example iterFoo <- enumFile "foo.txt" iterWhatever iterBar <- enumFile "bar.txt" iterFoo : \subsection{Enumeratees} In cases where an enumerator acts as both a source and sink, the resulting type is named an {\tt Enumeratee}. Enumeratees have two input types, ``outer a'' ({\tt ao}) and ``inner a'' ({\tt ai}). Enumeratees are encoded as an iteratee stack. The outer iteratee reads from a stream of \emph{ao} values, transforms them into \emph{ai}, and passes them to an inner iteratee. This model allows a single outer input to generate many inner inputs, and vice-versa. :f Data/Enumerator.hs |apidoc Data.Enumerator.Enumeratee| type Enumeratee ao ai m b = Step ai m b -> Iteratee ao m (Step ai m b) : \subsection{Operators} Because {\tt Iteratee a m b} is semantically equivalent to {\tt m (Step a m b)}, several of the monadic combinators ({\tt (>>=)}, {\tt (>=>)}, etc) are useful to save typing when constructing enumerators and enumeratees. {\tt (>>==)} corresponds to {\tt (>>=)}, {\tt (>==>)} to {\tt (>=>)}, and so on. For compatibility, {\tt (==<<)} is aliased to {\tt (\$\$)}. :f Data/Enumerator.hs infixl 1 >>== |apidoc Data.Enumerator.(>>==)| (>>==) :: Monad m => Iteratee a m b -> (Step a m b -> Iteratee a' m b') -> Iteratee a' m b' i >>== f = Iteratee (runIteratee i >>= runIteratee . f) : :f Data/Enumerator.hs infixr 1 ==<< |apidoc Data.Enumerator.(==<<)| (==<<) :: Monad m => (Step a m b -> Iteratee a' m b') -> Iteratee a m b -> Iteratee a' m b' (==<<) = flip (>>==) : :f Data/Enumerator.hs infixr 0 $$ |apidoc Data.Enumerator.($$)| ($$) :: Monad m => (Step a m b -> Iteratee a' m b') -> Iteratee a m b -> Iteratee a' m b' ($$) = (==<<) : :f Data/Enumerator.hs infixr 1 >==> |apidoc Data.Enumerator.(>==>)| (>==>) :: Monad m => Enumerator a m b -> (Step a m b -> Iteratee a' m b') -> Step a m b -> Iteratee a' m b' (>==>) e1 e2 s = e1 s >>== e2 : :f Data/Enumerator.hs infixr 1 <==< |apidoc Data.Enumerator.(<==<)| (<==<) :: Monad m => (Step a m b -> Iteratee a' m b') -> Enumerator a m b -> Step a m b -> Iteratee a' m b' (<==<) = flip (>==>) : :d Data.Enumerator exports -- ** Operators , (>>==) , (==<<) , ($$) , (>==>) , (<==<) : \subsection{Iteratees as Monads} Iteratees are monads; by sequencing iteratees, very complex processing may be applied to arbitrary input streams. Iteratees are also applicative functors and monad transformers. :f Data/Enumerator.hs instance Monad m => Monad (Iteratee a m) where return x = yield x (Chunks []) m >>= f = Iteratee $ runIteratee m >>= \r1 -> case r1 of Continue k -> return (Continue ((>>= f) . k)) Error err -> return (Error err) Yield x (Chunks []) -> runIteratee (f x) Yield x extra -> runIteratee (f x) >>= \r2 -> case r2 of Continue k -> runIteratee (k extra) Error err -> return (Error err) Yield x' _ -> return (Yield x' extra) : Most iteratees are used to wrap \io{} operations, so it's sensible to define instances for typeclasses from {\tt transformers}. :d Data.Enumerator imports import Control.Monad.Trans.Class (MonadTrans, lift) import Control.Monad.IO.Class (MonadIO, liftIO) : :f Data/Enumerator.hs instance MonadTrans (Iteratee a) where lift m = Iteratee (m >>= runIteratee . return) instance MonadIO m => MonadIO (Iteratee a m) where liftIO = lift . liftIO : It's probably possible to define {\tt Functor} and {\tt Applicative} instances for {\tt Iteratee} without a {\tt Monad} constraint, but I haven't bothered, since every useful operation requires {\tt m} to be a Monad anyway. :d Data.Enumerator imports import qualified Control.Applicative as A import qualified Control.Monad as CM : :f Data/Enumerator.hs instance Monad m => Functor (Iteratee a m) where fmap = CM.liftM : :f Data/Enumerator.hs instance Monad m => A.Applicative (Iteratee a m) where pure = return (<*>) = CM.ap :