pipes-parse-2.0.2: Parsing infrastructure for the pipes ecosystem

Safe HaskellNone

Pipes.Parse

Contents

Description

Element-agnostic parsing utilities for pipes

pipes-parse provides two ways to parse and transform streams in constant space:

  • The "list-like" approach, using the split / transform / join paradigm
  • The monadic approach, using parser combinators

The top half of this module provides the list-like approach, which is easier to use, but less powerful. The key idea is that:

 -- '~' means "is analogous to"
 Producer a m ()            ~   [a]

 FreeT (Producer a m) m ()  ~  [[a]]

FreeT nests each subsequent Producer within the return value of the previous Producer so that you cannot access the next Producer until you completely drain the current Producer. However, you rarely need to work with FreeT directly. Instead, you structure everything using "splitters", "transformations" and "joiners":

 -- A "splitter"
 Producer a m ()           -> FreeT (Producer a m) m ()  ~   [a]  -> [[a]]

 -- A "transformation"
 FreeT (Producer a m) m () -> FreeT (Producer a m) m ()  ~  [[a]] -> [[a]]

 -- A "joiner"
 FreeT (Producer a m) m () -> Producer a m ()            ~  [[a]] ->  [a]

For example, if you wanted to group standard input by equal lines and take the first three groups, you would write:

 import Pipes
 import qualified Pipes.Parse as Parse
 import qualified Pipes.Prelude as Prelude

 threeGroups :: (Monad m, Eq a) => Producer a m () -> Producer a m ()
 threeGroups = Parse.concat . Parse.takeFree 3 . Parse.groupBy (==)
 --            ^ Joiner       ^ Transformation   ^ Splitter

This then limits standard input to the first three consecutive groups of equal lines:

>>> runEffect $ threeGroups Prelude.stdinLn >-> Prelude.stdoutLn
Group1<Enter>
Group1
Group1<Enter>
Group1
Group2<Enter>
Group2
Group3<Enter>
Group3
Group3<Enter>
Group3
Group4<Enter>
>>> -- Done, because we began entering our fourth group

The advantage of this style or programming is that you never bring more than a single element into memory. This works because FreeT sub-divides the Producer without concatenating elements together, preserving the laziness of the underlying Producer.

The bottom half of this module lets you implement your own list-like transformations using monadic parsers.

For example, if you wanted to repeatedly sum every 3 elements and yield the result, you would write:

 import Control.Monad (unless)
 import Pipes
 import qualified Pipes.Prelude as P
 import Pipes.Parse

 sum3 :: (Monad m, Num a) => Producer a (StateT (Producer a m ()) m) ()
 sum3 = do
     eof <- lift isEndOfInput
     unless eof $ do
         n <- lift $ P.sum (input >-> P.take 3)
         yield n
         sum3

When you are done building the parser, you convert your parser to a list-like function using evalStateP:

 import Pipes.Lift (evalStateP)

 -- sum3'  ~  (Num a) => [a] -> [a]

 sum3' :: (Monad m, Num a) => Producer a m () -> Producer a m ()
 sum3' p = evalStateP p sum3

... then apply it to the Producer you want to transform:

>>> runEffect $ sum3' (P.readLn >-> P.takeWhile (/= 0)) >-> P.print
1<Enter>
4<Enter>
5<Enter>
10
2<Enter>
0<Enter>
2
>>> 

Synopsis

Splitters

groupBy :: Monad m => (a -> a -> Bool) -> Producer a m r -> FreeT (Producer a m) m rSource

Split a Producer into a FreeT-delimited stream of Producers grouped by the supplied equality predicate

chunksOf :: Monad m => Int -> Producer a m r -> FreeT (Producer a m) m rSource

Split a Producer into a FreeT-delimited stream of Producers of the given chunk size

splitOn :: Monad m => (a -> Bool) -> Producer a m r -> FreeT (Producer a m) m rSource

Split a Producer into a FreeT-delimited stream of Producers separated by elements that satisfy the given predicate

Transformations

takeFree :: (Functor f, Monad m) => Int -> FreeT f m () -> FreeT f m ()Source

(takeFree n) only keeps the first n functor layers of a FreeT

dropFree :: Monad m => Int -> FreeT (Producer a m) m r -> FreeT (Producer a m) m rSource

(dropFree n) peels off the first n layers of a FreeT

Use carefully: the peeling off is not free. This runs the first n layers, just discarding everything they produce.

Joiners

concat :: Monad m => FreeT (Producer a m) m r -> Producer a m rSource

Join a FreeT-delimited stream of Producers into a single Producer

intercalate :: Monad m => Producer a m () -> FreeT (Producer a m) m r -> Producer a m rSource

Join a FreeT-delimited stream of Producers into a single Producer by intercalating a Producer in between them

Low-level Parsers

pipes-parse handles end-of-input and pushback by storing a Producer in a StateT layer.

draw :: Monad m => StateT (Producer a m r) m (Either r a)Source

Draw one element from the underlying Producer, returning Left if the Producer is empty

unDraw :: Monad m => a -> StateT (Producer a m r) m ()Source

Push back an element onto the underlying Producer

peek :: Monad m => StateT (Producer a m r) m (Either r a)Source

peek checks the first element of the stream, but uses unDraw to push the element back so that it is available for the next draw command.

 peek = do
     x <- draw
     case x of
         Left  _ -> return ()
         Right a -> unDraw a
     return x

isEndOfInput :: Monad m => StateT (Producer a m r) m BoolSource

Check if the underlying Producer is empty

 isEndOfInput = liftM isLeft peek

High-level Parsers

input provides a Producer that streams from the underlying Producer.

Streaming from input differs from streaming directly from the underlying Producer because any unused input is saved for later, as the following example illustrates:

 import Control.Monad.Trans.State.Strict
 import Pipes
 import Pipes.Parse
 import qualified Pipes.Prelude as P

 parser :: (Show a) => StateT (Producer a IO ()) IO ()
 parser = do
     runEffect $ input >-> P.take 2 >-> P.show >-> P.stdoutLn

     liftIO $ putStrLn "Intermission"

     runEffect $ input >-> P.take 2 >-> P.show >-> P.stdoutLn

The second pipeline resumes where the first pipeline left off:

>>> evalStateT parser (each [1..])
1
2
Intermission
3
4

You can see more examples of how to use these parsing utilities by studying the source code for the above splitters.

input :: Monad m => Producer' a (StateT (Producer a m r) m) rSource

Stream from the underlying Producer

input terminates if the Producer is empty, returning the final return value of the Producer.

Utilities

takeWhile :: Monad m => (a -> Bool) -> Pipe a a (StateT (Producer a m r) m) ()Source

A variation on takeWhile from Pipes.Prelude that unDraws the first element that does not match

Re-exports

Control.Monad.Trans.Free re-exports FreeF, FreeT, and runFreeT.

Control.Monad.Trans.State.Strict re-exports StateT, runStateT, evalStateT, and execStateT.