Safe Haskell | None |
---|
Let's start off with a few simple examples of conduit
usage. First, a file
copy utility:
>>>
:load Data.Conduit.Binary
>>>
runResourceT $ sourceFile "input.txt" $$ sinkFile "output.txt"
runResourceT
is a function provided by the resourcet
package, and ensures
that resources are properly cleaned up, even in the presence of exceptions. The
type system will enforce that runResourceT
is called as needed. The remainder
of this tutorial will not discuss runResourceT
; please see the documentation
in resourcet
for more information.
Looking at the rest of our example, there are three components to understand:
sourceFile
, sinkFile
, and the $$
operator (called "connect"). These
represent the most basic building blocks in conduit
: a Source
produces a
stream of values, a Sink
consumes such a stream, and $$
will combine these
together.
In the case of file copying, there was no value produced by the Sink
.
However, often times a Sink
will produce some result value. For example:
>>>
:load Data.Conduit.List
>>>
:module +Prelude
>>>
sourceList [1..10] $$ fold (+) 0
55
sourceList
is a convenience function for turning a list into a Source
.
fold
implements a strict left fold for consuming the input stream.
The next major aspect to the conduit
library is the Conduit
type.
This type represents a stream transformer. In order to use a Conduit
, we
must fuse it with either a Source
or Sink
. For example:
>>>
:load Data.Conduit.List
>>>
:module +Prelude
>>>
sourceList [1..10] $= Data.Conduit.List.map (+1) $$ consume
[2,3,4,5,6,7,8,9,10,11]
Notice the addition of the $=
, or left fuse operator. This combines a
Source
and a Conduit
into a new Source
, which can then be connected to a
Sink
(in this case, consume
). We can similarly perform right fusion to
combine a Conduit
and Sink
, or middle fusion to combine two Conduit
s.
A number of very common functions are provided in the Data.Conduit.List module. Many of these functions correspond very closely to standard Haskell functions.
In addition to connecting and fusing components together, we can also build up
more sophisticated components through monadic composition. For example, to
create a Sink
that ignores the first 3 numbers and returns the sum of the
remaining numbers, we can use:
>>>
:load Data.Conduit.List
>>>
:module +Prelude
>>>
sourceList [1..10] $$ Data.Conduit.List.drop 3 >> fold (+) 0
49
In some cases, we might end up consuming more input than we needed, and want to
provide that input to the next component in our monadic chain. We refer to this
as leftovers. The simplest example of this is peek
.
>>>
:load Data.Conduit.List
>>>
:set -XNoMonomorphismRestriction
>>>
:module +Prelude
>>>
let sink = do { first <- peek; total <- fold (+) 0; return (first, total) }
>>>
sourceList [1..10] $$ sink
(Just 1,55)
Notice that, although we "consumed" the first value from the stream via
peek
, it was still available to fold
. This idea becomes even more important
when dealing with chunked data such as ByteString
s or Text
.
Final note: Notice in the types below that Source
, Sink
, and Conduit
are just type aliases. This will be explained later. Another important aspect
is resource finalization, which will also be covered below.
- type Source m o = ConduitM () o m ()
- type Conduit i m o = ConduitM i o m ()
- type Sink i m r = ConduitM i Void m r
- ($$) :: Monad m => Source m a -> Sink a m b -> m b
- ($=) :: Monad m => Source m a -> Conduit a m b -> Source m b
- (=$) :: Monad m => Conduit a m b -> Sink b m c -> Sink a m c
- (=$=) :: Monad m => Conduit a m b -> ConduitM b c m r -> ConduitM a c m r
- await :: Monad m => Consumer i m (Maybe i)
- awaitForever :: Monad m => (i -> ConduitM i o m r) -> ConduitM i o m ()
- yield :: Monad m => o -> ConduitM i o m ()
- yieldOr :: Monad m => o -> m () -> ConduitM i o m ()
- leftover :: i -> ConduitM i o m ()
- bracketP :: MonadResource m => IO a -> (a -> IO ()) -> (a -> ConduitM i o m r) -> ConduitM i o m r
- addCleanup :: Monad m => (Bool -> m ()) -> ConduitM i o m r -> ConduitM i o m r
- data ResumableSource m o
- ($$+) :: Monad m => Source m a -> Sink a m b -> m (ResumableSource m a, b)
- ($$++) :: Monad m => ResumableSource m a -> Sink a m b -> m (ResumableSource m a, b)
- ($$+-) :: Monad m => ResumableSource m a -> Sink a m b -> m b
- unwrapResumable :: MonadIO m => ResumableSource m o -> m (Source m o, m ())
- transPipe :: Monad m => (forall a. m a -> n a) -> ConduitM i o m r -> ConduitM i o n r
- mapOutput :: Monad m => (o1 -> o2) -> ConduitM i o1 m r -> ConduitM i o2 m r
- mapOutputMaybe :: Monad m => (o1 -> Maybe o2) -> ConduitM i o1 m r -> ConduitM i o2 m r
- mapInput :: Monad m => (i1 -> i2) -> (i2 -> Maybe i1) -> ConduitM i2 o m r -> ConduitM i1 o m r
- type Producer m o = forall i. ConduitM i o m ()
- type Consumer i m r = forall o. ConduitM i o m r
- toProducer :: Monad m => Source m a -> Producer m a
- toConsumer :: Monad m => Sink a m b -> Consumer a m b
- data Flush a
- data ResourceT m a
- class (MonadThrow m, MonadUnsafeIO m, MonadIO m, Applicative m) => MonadResource m
- class Monad m => MonadThrow m where
- monadThrow :: Exception e => e -> m a
- class Monad m => MonadUnsafeIO m where
- unsafeLiftIO :: IO a -> m a
- runResourceT :: MonadBaseControl IO m => ResourceT m a -> m a
- newtype ExceptionT m a = ExceptionT {
- runExceptionT :: m (Either SomeException a)
- runExceptionT_ :: Monad m => ExceptionT m a -> m a
- runException :: ExceptionT Identity a -> Either SomeException a
- runException_ :: ExceptionT Identity a -> a
- class MonadBase b m => MonadBaseControl b m | m -> b
Conduit interface
type Source m o = ConduitM () o m ()Source
Provides a stream of output values, without consuming any input or producing a final result.
Since 0.5.0
type Conduit i m o = ConduitM i o m ()Source
Consumes a stream of input values and produces a stream of output values, without producing a final result.
Since 0.5.0
type Sink i m r = ConduitM i Void m rSource
Consumes a stream of input values and produces a final result, without producing any output.
Since 0.5.0
Connect/fuse
It is important to understand the lifecycle of our components. Notice that we can connect or fuse two components together. When we do this, the component providing output is called upstream, and the component consuming this input is called downstream. We can have arbitrarily long chains of such fusion, so a single component can simultaneously function as upstream and downstream.
Each component can be in one of four states of operation at any given time:
- It hasn't yet started operating.
- It is providing output downstream.
- It is waiting for input from upstream.
- It has completed processing.
Let's use sourceFile
and sinkFile
as an example. When we run sourceFile
input $$ sinkFile output
, both components begin in the "not started"
state. Next, we start running sinkFile
(note: we always begin processing on
the downstream component). sinkFile
will open up the file, and then wait for
input from upstream.
Next, we'll start running sourceFile
, which will open the file, read some
data from it, and provide it as output downstream. This will be fed to
sinkFile
(which was already waiting). sinkFile
will write the data to a
file, then ask for more input. This process will continue until sourceFile
reaches the end of the input. It will close the file handle and switch to the
completed state. When this happens, sinkFile
is sent a signal that no more
input is available. It will then close its file and return a result.
Now let's change things up a bit. Suppose we were instead connecting
sourceFile
to take 1
. We start by running take 1
, which will wait for
some input. We'll then start sourceFile
, which will open the file, read a
chunk, and send it downstream. take 1
will take that single chunk and return
it as a result. Once it does this, it has transitioned to the complete state.
We don't want to pull any more data from sourceFile
, as we do not need it.
So instead, we call sourceFile
's finalizer. Each time upstream provides
output, it also provides a finalizer to be run if downstream finishes
processing.
One final case: suppose we connect sourceFile
to return ()
. The latter does
nothing: it immediately switches to the complete state. In this case, we never
even start running sourceFile
(it stays in the "not yet started" state),
and so no finalization occurs.
So here are the takeaways from the above discussion:
- When upstream completes before downstream, it cleans up all of its resources and sends some termination signal. We never think about upstream again. This can only occur while downstream is in the "waiting for input" state, since that is the only time that upstream is called.
- When downstream completes before upstream, we finalize upstream immediately. This can only occur when upstream produces output, because that's the only time when control is passed back to downstream.
- If downstream never awaits for input before it terminates, upstream was never started, and therefore it does not need to be finalized.
Note that all of the discussion above applies equally well to chains of components. If you have an upstream, middle, and downstream component, and downstream terminates, then the middle component will be finalized, which in turn will trigger upstream to be finalized. This setup ensures that we always have prompt resource finalization.
($$) :: Monad m => Source m a -> Sink a m b -> m bSource
The connect operator, which pulls data from a source and pushes to a sink.
When either side closes, the other side will immediately be closed as well.
If you would like to keep the Source
open to be used for another
operations, use the connect-and-resume operator $$+
.
Since 0.4.0
($=) :: Monad m => Source m a -> Conduit a m b -> Source m bSource
Left fuse, combining a source and a conduit together into a new source.
Both the Source
and Conduit
will be closed when the newly-created
Source
is closed.
Leftover data from the Conduit
will be discarded.
Since 0.4.0
(=$) :: Monad m => Conduit a m b -> Sink b m c -> Sink a m cSource
Right fuse, combining a conduit and a sink together into a new sink.
Both the Conduit
and Sink
will be closed when the newly-created Sink
is closed.
Leftover data returned from the Sink
will be discarded.
Since 0.4.0
(=$=) :: Monad m => Conduit a m b -> ConduitM b c m r -> ConduitM a c m rSource
Fusion operator, combining two Conduit
s together into a new Conduit
.
Both Conduit
s will be closed when the newly-created Conduit
is closed.
Leftover data returned from the right Conduit
will be discarded.
Since 0.4.0
Primitives
While conduit
provides a number of built-in Source
s, Sink
s, and
Conduit
s, you will almost certainly want to construct some of your own.
Previous versions recommended using the constructors directly. Beginning with
0.5, the recommended approach is to compose existing Pipe
s into larger ones.
It is certainly possible (and advisable!) to leverage existing Pipe
s- like
those in Data.Conduit.List. However, you will often need to go to a lower
level set of Pipe
s to start your composition. The following few functions
should be sufficient for expressing all constructs besides finalization. Adding
in bracketP
and addCleanup
, you should be able to create any Pipe
you
need. (In fact, that's precisely how the remainder of this package is written.)
The three basic operations are awaiting, yielding, and leftovers.
Awaiting asks for a new value from upstream, or returns Nothing
if upstream
is done. For example:
>>>
:load Data.Conduit.List
>>>
sourceList [1..10] $$ await
Just 1
>>>
:load Data.Conduit.List
>>>
sourceList [] $$ await
Nothing
Similarly, we have a yield
function, which provides a value to the downstream
Pipe
. yield
features auto-termination: if the downstream Pipe
has
already completed processing, the upstream Pipe
will stop processing when it
tries to yield
.
The upshot of this is that you can write code that appears to loop infinitely, and yet will terminate.
>>>
:set -XNoMonomorphismRestriction
>>>
let infinite = yield () >> infinite
>>>
infinite $$ await
Just ()
Or for something a bit more sophisticated:
>>>
let enumFrom' i = yield i >> enumFrom' (succ i)
>>>
enumFrom' 1 $$ take 5
[1,2,3,4,5]
The final primitive Pipe
is leftover
. This allows you to return unused
input to be used by the next Pipe
in the monadic chain. A simple use case
would be implementing the peek
function:
>>>
let peek = await >>= maybe (return Nothing) (\x -> leftover x >> return (Just x))
>>>
enumFrom' 1 $$ do { mx <- peek; my <- await; mz <- await; return (mx, my, mz) }
(Just 1,Just 1,Just 2)
Note that you should only return leftovers that were previously yielded from upstream.
awaitForever :: Monad m => (i -> ConduitM i o m r) -> ConduitM i o m ()Source
Finalization
bracketP :: MonadResource m => IO a -> (a -> IO ()) -> (a -> ConduitM i o m r) -> ConduitM i o m rSource
Connect-and-resume
Sometimes, we do not want to force our entire application to live inside the
Pipe
monad. It can be convenient to keep normal control flow of our program,
and incrementally apply data from a Source
to various Sink
s. A strong
motivating example for this use case is interleaving multiple Source
s, such
as combining a conduit
-powered HTTP server and client into an HTTP proxy.
Normally, when we run a Pipe
, we get a result and can never run it again.
Connect-and-resume allows us to connect a Source
to a Sink
until the latter
completes, and then return the current state of the Source
to be applied
later. To do so, we introduce three new operators. Let' start off by
demonstrating them:
>>>
:load Data.Conduit.List
>>>
(next, x) <- sourceList [1..10] $$+ take 5
>>>
Prelude.print x
[1,2,3,4,5]>>>
(next, y) <- next $$++ (isolate 4 =$ fold (Prelude.+) 0)
>>>
Prelude.print y
30>>>
next $$+- consume
[10]
data ResumableSource m o Source
A Source
which has been started, but has not yet completed.
This type contains both the current state of the Source
, and the finalizer
to be run to close it.
Since 0.5.0
($$+) :: Monad m => Source m a -> Sink a m b -> m (ResumableSource m a, b)Source
The connect-and-resume operator. This does not close the Source
, but
instead returns it to be used again. This allows a Source
to be used
incrementally in a large program, without forcing the entire program to live
in the Sink
monad.
Mnemonic: connect + do more.
Since 0.5.0
($$++) :: Monad m => ResumableSource m a -> Sink a m b -> m (ResumableSource m a, b)Source
Continue processing after usage of $$+
.
Since 0.5.0
($$+-) :: Monad m => ResumableSource m a -> Sink a m b -> m bSource
Complete processing of a ResumableSource
. This will run the finalizer
associated with the ResumableSource
. In order to guarantee process resource
finalization, you must use this operator after using $$+
and $$++
.
Since 0.5.0
unwrapResumable :: MonadIO m => ResumableSource m o -> m (Source m o, m ())Source
Unwraps a ResumableSource
into a Source
and a finalizer.
A ResumableSource
represents a Source
which has already been run, and
therefore has a finalizer registered. As a result, if we want to turn it
into a regular Source
, we need to ensure that the finalizer will be run
appropriately. By appropriately, I mean:
- If a new finalizer is registered, the old one should not be called. * If the old one is called, it should not be called again.
This function returns both a Source
and a finalizer which ensures that the
above two conditions hold. Once you call that finalizer, the Source
is
invalidated and cannot be used.
Since 0.5.2
Utility functions
Generalized conduit types
It's recommended to keep your type signatures as general as possible to
encourage reuse. For example, a theoretical signature for the head
function
would be:
head :: Sink a m (Maybe a)
However, doing so would prevent usage of head
from inside a Conduit
, since
a Sink
sets its output type parameter to Void
. The most general type
signature would instead be:
head :: Pipe l a o u m (Maybe a)
However, that signature is much more confusing. To bridge this gap, we also provide some generalized conduit types. They follow a simple naming convention:
- They have the same name as their non-generalized types, with a
G
prepended. - If they have leftovers, we add an
L
. - If they consume the entirety of their input stream and return the upstream
result, we add
Inf
to indicate infinite consumption.
type Consumer i m r = forall o. ConduitM i o m rSource
Generalized Sink
without leftovers.
Since 0.5.0
toProducer :: Monad m => Source m a -> Producer m aSource
toConsumer :: Monad m => Sink a m b -> Consumer a m bSource
Flushing
Provide for a stream of data that can be flushed.
A number of Conduit
s (e.g., zlib compression) need the ability to flush
the stream at some point. This provides a single wrapper datatype to be used
in all such circumstances.
Since 0.3.0
Convenience re-exports
data ResourceT m a
The Resource transformer. This transformer keeps track of all registered
actions, and calls them upon exit (via runResourceT
). Actions may be
registered via register
, or resources may be allocated atomically via
allocate
. allocate
corresponds closely to bracket
.
Releasing may be performed before exit via the release
function. This is a
highly recommended optimization, as it will ensure that scarce resources are
freed early. Note that calling release
will deregister the action, so that
a release action will only ever be called once.
Since 0.3.0
MonadTrans ResourceT | |
MonadTransControl ResourceT | |
MonadRWS r w s m => MonadRWS r w s (ResourceT m) | |
MonadBase b m => MonadBase b (ResourceT m) | |
MonadBaseControl b m => MonadBaseControl b (ResourceT m) | |
MonadError e m => MonadError e (ResourceT m) | |
MonadReader r m => MonadReader r (ResourceT m) | |
MonadState s m => MonadState s (ResourceT m) | |
MonadWriter w m => MonadWriter w (ResourceT m) | |
Monad m => Monad (ResourceT m) | |
Functor m => Functor (ResourceT m) | |
Typeable1 m => Typeable1 (ResourceT m) | |
Applicative m => Applicative (ResourceT m) | |
MonadIO m => MonadIO (ResourceT m) | |
MonadCont m => MonadCont (ResourceT m) | |
(MonadThrow m, MonadUnsafeIO m, MonadIO m, Applicative m) => MonadResource (ResourceT m) | |
MonadThrow m => MonadThrow (ResourceT m) | |
(MonadIO m, MonadActive m) => MonadActive (ResourceT m) |
class (MonadThrow m, MonadUnsafeIO m, MonadIO m, Applicative m) => MonadResource m
A Monad
which allows for safe resource allocation. In theory, any monad
transformer stack included a ResourceT
can be an instance of
MonadResource
.
Note: runResourceT
has a requirement for a MonadBaseControl IO m
monad,
which allows control operations to be lifted. A MonadResource
does not
have this requirement. This means that transformers such as ContT
can be
an instance of MonadResource
. However, the ContT
wrapper will need to be
unwrapped before calling runResourceT
.
Since 0.3.0
MonadResource m => MonadResource (ListT m) | |
(MonadThrow m, MonadUnsafeIO m, MonadIO m, Applicative m) => MonadResource (ResourceT m) | |
MonadResource m => MonadResource (MaybeT m) | |
MonadResource m => MonadResource (IdentityT m) | |
MonadResource m => MonadResource (ContT r m) | |
(Error e, MonadResource m) => MonadResource (ErrorT e m) | |
MonadResource m => MonadResource (ReaderT r m) | |
MonadResource m => MonadResource (StateT s m) | |
MonadResource m => MonadResource (StateT s m) | |
(Monoid w, MonadResource m) => MonadResource (WriterT w m) | |
(Monoid w, MonadResource m) => MonadResource (WriterT w m) | |
MonadResource m => MonadResource (ConduitM i o m) | |
(Monoid w, MonadResource m) => MonadResource (RWST r w s m) | |
(Monoid w, MonadResource m) => MonadResource (RWST r w s m) | |
MonadResource m => MonadResource (Pipe l i o u m) |
class Monad m => MonadThrow m where
A Monad
which can throw exceptions. Note that this does not work in a
vanilla ST
or Identity
monad. Instead, you should use the ExceptionT
transformer in your stack if you are dealing with a non-IO
base monad.
Since 0.3.0
monadThrow :: Exception e => e -> m a
MonadThrow [] | |
MonadThrow IO | |
MonadThrow Maybe | |
MonadThrow (Either SomeException) | |
MonadThrow m => MonadThrow (ListT m) | |
MonadThrow m => MonadThrow (ResourceT m) | |
Monad m => MonadThrow (ExceptionT m) | |
MonadThrow m => MonadThrow (MaybeT m) | |
MonadThrow m => MonadThrow (IdentityT m) | |
MonadThrow m => MonadThrow (ContT r m) | |
(Error e, MonadThrow m) => MonadThrow (ErrorT e m) | |
MonadThrow m => MonadThrow (ReaderT r m) | |
MonadThrow m => MonadThrow (StateT s m) | |
MonadThrow m => MonadThrow (StateT s m) | |
(Monoid w, MonadThrow m) => MonadThrow (WriterT w m) | |
(Monoid w, MonadThrow m) => MonadThrow (WriterT w m) | |
MonadThrow m => MonadThrow (ConduitM i o m) | |
(Monoid w, MonadThrow m) => MonadThrow (RWST r w s m) | |
(Monoid w, MonadThrow m) => MonadThrow (RWST r w s m) | |
MonadThrow m => MonadThrow (Pipe l i o u m) |
class Monad m => MonadUnsafeIO m where
A Monad
based on some monad which allows running of some IO
actions,
via unsafe calls. This applies to IO
and ST
, for instance.
Since 0.3.0
unsafeLiftIO :: IO a -> m a
MonadUnsafeIO IO | |
(MonadTrans t, MonadUnsafeIO m, Monad (t m)) => MonadUnsafeIO (t m) | |
MonadUnsafeIO (ST s) | |
MonadUnsafeIO (ST s) |
runResourceT :: MonadBaseControl IO m => ResourceT m a -> m a
Unwrap a ResourceT
transformer, and call all registered release actions.
Note that there is some reference counting involved due to resourceForkIO
.
If multiple threads are sharing the same collection of resources, only the
last call to runResourceT
will deallocate the resources.
Since 0.3.0
newtype ExceptionT m a
The express purpose of this transformer is to allow non-IO
-based monad
stacks to catch exceptions via the MonadThrow
typeclass.
Since 0.3.0
ExceptionT | |
|
MonadTrans ExceptionT | |
MonadTransControl ExceptionT | |
MonadRWS r w s m => MonadRWS r w s (ExceptionT m) | |
MonadBase b m => MonadBase b (ExceptionT m) | |
MonadBaseControl b m => MonadBaseControl b (ExceptionT m) | |
MonadError e m => MonadError e (ExceptionT m) | |
MonadReader r m => MonadReader r (ExceptionT m) | |
MonadState s m => MonadState s (ExceptionT m) | |
MonadWriter w m => MonadWriter w (ExceptionT m) | |
Monad m => Monad (ExceptionT m) | |
Monad m => Functor (ExceptionT m) | |
Monad m => Applicative (ExceptionT m) | |
MonadCont m => MonadCont (ExceptionT m) | |
Monad m => MonadThrow (ExceptionT m) |
runExceptionT_ :: Monad m => ExceptionT m a -> m a
Same as runExceptionT
, but immediately throw
any exception returned.
Since 0.3.0
runException :: ExceptionT Identity a -> Either SomeException a
Run an ExceptionT Identity
stack.
Since 0.4.2
runException_ :: ExceptionT Identity a -> a
Run an ExceptionT Identity
stack, but immediately throw
any exception returned.
Since 0.4.2
class MonadBase b m => MonadBaseControl b m | m -> b