streamly-0.8.1.1: Dataflow programming and declarative concurrency
Copyright    (c) 2020 Composewell Technologies
License      BSD-3-Clause
Maintainer   streamly@composewell.com
Stability    pre-release
Portability  GHC
Safe Haskell None
Language     Haskell2010

Streamly.Internal.Data.Parser

Description

Fast backtracking parsers with stream fusion and native streaming capability.

Applicative and Alternative type class based combinators from the parser-combinators package can also be used with the Parser type. However, there are two important performance differences between the combinators from parser-combinators and the equivalent ones provided in this module:

1) parser-combinators uses plain Haskell lists to collect the results; in a strict monad like IO, the results are necessarily buffered before they can be consumed. This may not perform optimally in streaming applications processing large amounts of data. Equivalent combinators in this module can consume the results of parsing using a Fold, thus providing a scalable and composable consumer (see the sketch after this list).

2) Several combinators in this module can be many times faster because of stream fusion. For example, the many combinator in this module is much faster than the many combinator of the Alternative type class.
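
For instance, a minimal sketch of fold-driven repetition, assuming Fold.sum from Streamly.Data.Fold and the many and satisfy combinators documented below; the results are reduced incrementally instead of being buffered in a list:

>>> Stream.parse (Parser.many (Parser.satisfy even) Fold.sum) $ Stream.fromList [2,4,6,1,2]
12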

Errors

Failing parsers in this module throw the ParseError exception.

Naming

As far as possible, the names of the combinators in this module are kept consistent with the names used by other popular parsing libraries.

Synopsis

Documentation

newtype Parser m a b Source #

A continuation passing style parser representation.

Constructors

MkParser 

Fields

Instances

Instances details
(MonadThrow m, MonadState s m) => MonadState s (Parser m a) Source # 
Instance details

Defined in Streamly.Internal.Data.Parser.ParserK.Type

Methods

get :: Parser m a s #

put :: s -> Parser m a () #

state :: (s -> (a0, s)) -> Parser m a a0 #

(MonadThrow m, MonadReader r m, MonadCatch m) => MonadReader r (Parser m a) Source # 
Instance details

Defined in Streamly.Internal.Data.Parser.ParserK.Type

Methods

ask :: Parser m a r #

local :: (r -> r) -> Parser m a a0 -> Parser m a a0 #

reader :: (r -> a0) -> Parser m a a0 #

Monad m => Monad (Parser m a) Source #

Monad composition can be used for lookbehind parsers; we can make future parses depend on previously parsed values.

If we have to parse "a9" or "9a" but not "99" or "aa" we can use the following parser:

backtracking :: MonadCatch m => PR.Parser m Char String
backtracking =
    sequence [PR.satisfy isDigit, PR.satisfy isAlpha]
    <|>
    sequence [PR.satisfy isAlpha, PR.satisfy isDigit]

We know that if the first parse resulted in a digit in the first position then the second parse is going to fail. However, we waste that information and parse the first character again in the second parse, only to find that it is not an alphabetic character. By using lookbehind in a Monad composition we can avoid this redundant work:

data DigitOrAlpha = Digit Char | Alpha Char

lookbehind :: MonadCatch m => PR.Parser m Char String
lookbehind = do
    x1 <-    Digit <$> PR.satisfy isDigit
         <|> Alpha <$> PR.satisfy isAlpha

    -- Note: the parse depends on what we parsed already
    x2 <- case x1 of
        Digit _ -> PR.satisfy isAlpha
        Alpha _ -> PR.satisfy isDigit

    return $ case x1 of
        Digit x -> [x,x2]
        Alpha x -> [x,x2]

See also concatMap. This Monad instance does not fuse; use concatMap when you need fusion.
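
As a usage sketch, assuming the DigitOrAlpha type and the lookbehind parser defined above are in scope, we would expect:

>>> Stream.parse lookbehind $ Stream.fromList "a9"
"a9"
>>> Stream.parse lookbehind $ Stream.fromList "9a"
"9a"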

Instance details

Defined in Streamly.Internal.Data.Parser.ParserK.Type

Methods

(>>=) :: Parser m a a0 -> (a0 -> Parser m a b) -> Parser m a b #

(>>) :: Parser m a a0 -> Parser m a b -> Parser m a b #

return :: a0 -> Parser m a a0 #

Functor m => Functor (Parser m a) Source #

Maps a function over the output of the parser.

Instance details

Defined in Streamly.Internal.Data.Parser.ParserK.Type

Methods

fmap :: (a0 -> b) -> Parser m a a0 -> Parser m a b #

(<$) :: a0 -> Parser m a b -> Parser m a a0 #

Monad m => MonadFail (Parser m a) Source # 
Instance details

Defined in Streamly.Internal.Data.Parser.ParserK.Type

Methods

fail :: String -> Parser m a a0 #

Monad m => Applicative (Parser m a) Source #

Applicative form of serialWith. Note that this operation does not fuse; use serialWith when fusion is important.

Instance details

Defined in Streamly.Internal.Data.Parser.ParserK.Type

Methods

pure :: a0 -> Parser m a a0 #

(<*>) :: Parser m a (a0 -> b) -> Parser m a a0 -> Parser m a b #

liftA2 :: (a0 -> b -> c) -> Parser m a a0 -> Parser m a b -> Parser m a c #

(*>) :: Parser m a a0 -> Parser m a b -> Parser m a b #

(<*) :: Parser m a a0 -> Parser m a b -> Parser m a a0 #

(MonadThrow m, MonadIO m) => MonadIO (Parser m a) Source # 
Instance details

Defined in Streamly.Internal.Data.Parser.ParserK.Type

Methods

liftIO :: IO a0 -> Parser m a a0 #

Monad m => Alternative (Parser m a) Source #

Alternative form of alt. Backtrack and run the second parser if the first one fails.

The "some" and "many" operations of alternative accumulate results in a pure list which is not scalable and streaming. Instead use some and many for fusible operations with composable accumulation of results.

See also alt. This Alternative instance does not fuse, use alt when you need fusion.

Instance details

Defined in Streamly.Internal.Data.Parser.ParserK.Type

Methods

empty :: Parser m a a0 #

(<|>) :: Parser m a a0 -> Parser m a a0 -> Parser m a a0 #

some :: Parser m a a0 -> Parser m a [a0] #

many :: Parser m a a0 -> Parser m a [a0] #

Monad m => MonadPlus (Parser m a) Source #

mzero is the same as empty; it aborts the parser. mplus is the same as <|>; it selects the first succeeding parser.

Pre-release

Instance details

Defined in Streamly.Internal.Data.Parser.ParserK.Type

Methods

mzero :: Parser m a a0 #

mplus :: Parser m a a0 -> Parser m a a0 -> Parser m a a0 #

newtype ParseError Source #

This exception is used for two purposes:

  • When a parser ultimately fails, the user of the parser is notified via this exception.
  • When the "extract" function of a parser needs to throw an error.

Pre-release

Constructors

ParseError String 

data Step s b Source #

The return type of a Parser step.

The parse operation feeds the input stream to the parser one element at a time, representing a parse Step. The parser may or may not consume the item and returns a result. If the result is Partial we can either extract the result or feed more input to the parser. If the result is Continue, we must feed more input in order to get a result. If the parser returns Done then the parser can no longer take any more input.

If the result is Continue, the parse operation retains the input in a backtracking buffer, in case the parser may ask to backtrack in future. Whenever a 'Partial n' result is returned we first backtrack by n elements in the input and then release any remaining backtracking buffer. Similarly, 'Continue n' backtracks to n elements before the current position and starts feeding the input from that point for future invocations of the parser.

If the parser is not yet done, we can use the extract operation on the state of the parser to extract a result. If the parser has not yet yielded a result, the operation fails with a ParseError exception. If the parser yielded a Partial result in the past, the last partial result is returned. Therefore, once a parser yields a partial result it cannot fail later on.

The parser can never backtrack beyond the position where the last partial result left it; the parser must ensure that the backtrack position is always after that.

Pre-release

Constructors

Partial Int s

Partial result with an optional backtrack request.

Partial count state means a partial result is available which can be extracted successfully; state is the opaque state of the parser to be supplied to the next invocation of the step operation. The current input position is reset to count elements back and any input before that is dropped from the backtrack buffer.

Continue Int s

Need more input with an optional backtrack request.

Continue count state means the parser has consumed the current input but no new result is generated; state is the next state of the parser. The current input is retained in the backtrack buffer and the input position is reset to count elements back.

Done Int b

Done with leftover input count and result.

Done count result means the parser has finished; it will accept no more input. The last count elements of the input are unused and the result of the parser is in result.

Error String

Parser failed without generating any output.

The parsing operation may backtrack to the beginning and try another alternative.
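
As an illustrative sketch (not the library's implementation), a step function for a parser that collects exactly two input elements could use these constructors as follows; the surrounding initial and extract plumbing is omitted and the names are hypothetical:

import Streamly.Internal.Data.Parser.ParserD.Type (Step (..))

-- Hypothetical step function; the state is the list of elements seen so far.
step2 :: Monad m => [a] -> a -> m (Step [a] [a])
step2 [x] y = return $ Done 0 [x, y]        -- second element: result ready, no leftover
step2 acc y = return $ Continue 0 (y : acc) -- need more input, no backtrack requested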

Instances

Instances details
Functor (Step s) Source # 
Instance details

Defined in Streamly.Internal.Data.Parser.ParserD.Type

Methods

fmap :: (a -> b) -> Step s a -> Step s b #

(<$) :: a -> Step s b -> Step s a #

Downgrade to Fold

toFold :: MonadThrow m => Parser m a b -> Fold m a b Source #

Make a Fold from a Parser. The fold just throws an exception if the parser fails or tries to backtrack.

This can be useful in combinators that accept a Fold, when we know that the Parser cannot fail, or when a failure exception is acceptable because there is no way to recover.
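
A hedged usage sketch, assuming Fold.sum from Streamly.Data.Fold and Stream.fold from Streamly.Prelude; since takeEQ neither fails on this input nor backtracks, we would expect the derived fold to terminate after two elements:

>>> Stream.fold (Parser.toFold (Parser.takeEQ 2 Fold.sum)) $ Stream.fromList [1,2,3]
3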

Pre-release

Accumulators

fromFold :: MonadCatch m => Fold m a b -> Parser m a b Source #

Make a Parser from a Fold.
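
A minimal usage sketch, assuming Fold.sum from Streamly.Data.Fold:

>>> Stream.parse (Parser.fromFold Fold.sum) $ Stream.fromList [1..5]
15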

Pre-release

fromPure :: MonadCatch m => b -> Parser m a b Source #

A parser that always yields a pure value without consuming any input.

Pre-release

fromEffect :: MonadCatch m => m b -> Parser m a b Source #

A parser that always yields the result of an effectful action without consuming any input.

Pre-release

die :: MonadCatch m => String -> Parser m a b Source #

A parser that always fails with an error message without consuming any input.

Pre-release

dieM :: MonadCatch m => m String -> Parser m a b Source #

A parser that always fails with an effectful error message and without consuming any input.

Pre-release

Element parsers

peek :: MonadCatch m => Parser m a a Source #

Peek the head element of a stream, without consuming it. Fails if it encounters end of input.

>>> Stream.parse ((,) <$> Parser.peek <*> Parser.satisfy (> 0)) $ Stream.fromList [1]
(1,1)
peek = lookAhead (satisfy (const True))

Pre-release

eof :: MonadCatch m => Parser m a () Source #

Succeeds if we are at the end of input, fails otherwise.

>>> Stream.parse ((,) <$> Parser.satisfy (> 0) <*> Parser.eof) $ Stream.fromList [1]
(1,())

Pre-release

satisfy :: MonadCatch m => (a -> Bool) -> Parser m a a Source #

Returns the next element if it passes the predicate, fails otherwise.

>>> Stream.parse (Parser.satisfy (== 1)) $ Stream.fromList [1,0,1]
1

Pre-release

next :: MonadCatch m => Parser m a (Maybe a) Source #

Return the next element of the input. Returns Nothing on end of input. Also known as head.

Pre-release

maybe :: MonadCatch m => (a -> Maybe b) -> Parser m a b Source #

Map a Maybe-returning function on the next element in the stream. The parser fails if the function returns Nothing, otherwise it returns the value from the Just.
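
A small sketch, mapping a partial conversion over the next element; with this input we would expect:

>>> Stream.parse (Parser.maybe (\a -> if a > 0 then Just (a * 10) else Nothing)) $ Stream.fromList [1,2]
10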

Pre-release

either :: MonadCatch m => (a -> Either String b) -> Parser m a b Source #

Map an Either-returning function on the next element in the stream. If the function returns 'Left err', the parser fails with the error message err, otherwise it returns the Right value.

Pre-release

Sequence parsers

Parsers chained in series; if one parser terminates, the composition terminates.

Grab a sequence of input elements without inspecting them

takeBetween :: MonadCatch m => Int -> Int -> Fold m a b -> Parser m a b Source #

takeBetween m n takes a minimum of m and a maximum of n input elements and folds them using the supplied fold.

Stops after n elements. Fails if the stream ends before m elements could be taken.

Examples:

>>> :{
  takeBetween' low high ls = Stream.parse prsr (Stream.fromList ls)
    where prsr = Parser.takeBetween low high Fold.toList
:}

>>> takeBetween' 2 4 [1, 2, 3, 4, 5]
[1,2,3,4]
>>> takeBetween' 2 4 [1, 2]
[1,2]
>>> takeBetween' 2 4 [1]
*** Exception: ParseError "takeBetween: Expecting alteast 2 elements, got 1"
>>> takeBetween' 0 0 [1, 2]
[]
>>> takeBetween' 0 1 []
[]

takeBetween is the most general take operation; other take operations can be defined in terms of takeBetween. For example:

take = takeBetween 0 n  -- equivalent of take
take1 = takeBetween 1 n -- equivalent of takeLE1
takeEQ = takeBetween n n
takeGE = takeBetween n maxBound

Pre-release

takeEQ :: MonadCatch m => Int -> Fold m a b -> Parser m a b Source #

Stops after taking exactly n input elements.

  • Stops - after consuming n elements.
  • Fails - if the stream or the collecting fold ends before it can collect exactly n elements.
>>> Stream.parse (Parser.takeEQ 4 Fold.toList) $ Stream.fromList [1,0,1]
*** Exception: ParseError "takeEQ: Expecting exactly 4 elements, input terminated on 3"

Pre-release

takeGE :: MonadCatch m => Int -> Fold m a b -> Parser m a b Source #

Take at least n input elements, but can collect more.

  • Stops - when the collecting fold stops.
  • Fails - if the stream or the collecting fold ends before producing n elements.
>>> Stream.parse (Parser.takeGE 4 Fold.toList) $ Stream.fromList [1,0,1]
*** Exception: ParseError "takeGE: Expecting at least 4 elements, input terminated on 3"
>>> Stream.parse (Parser.takeGE 4 Fold.toList) $ Stream.fromList [1,0,1,0,1]
[1,0,1,0,1]

Pre-release

takeP :: MonadCatch m => Int -> Parser m a b -> Parser m a b Source #

Takes at-most n input elements.

  • Stops - when the collecting parser stops.
  • Fails - when the collecting parser fails.
>>> Stream.parse (Parser.takeP 4 (Parser.takeEQ 2 Fold.toList)) $ Stream.fromList [1, 2, 3, 4, 5]
[1,2]
>>> Stream.parse (Parser.takeP 4 (Parser.takeEQ 5 Fold.toList)) $ Stream.fromList [1, 2, 3, 4, 5]
*** Exception: ParseError "takeEQ: Expecting exactly 5 elements, input terminated on 4"

Internal

lookAhead :: MonadCatch m => Parser m a b -> Parser m a b Source #

Run a parser without consuming the input.
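
A usage sketch: the looked-ahead input is not consumed, so a following parser sees the same elements again:

>>> Stream.parse ((,) <$> Parser.lookAhead (Parser.takeEQ 2 Fold.toList) <*> Parser.takeEQ 2 Fold.toList) $ Stream.fromList [1,2]
([1,2],[1,2])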

Pre-release

takeWhileP :: (a -> Bool) -> Parser m a b -> Parser m a b Source #

Like takeWhile but uses a Parser instead of a Fold to collect the input. The combinator stops when the condition fails or if the collecting parser stops.

This is a generalized version of takeWhile, for example takeWhile1 can be implemented in terms of this:

takeWhile1 cond p = takeWhileP cond (takeBetween 1 maxBound p)

Stops: when the condition fails or the collecting parser stops. Fails: when the collecting parser fails.

Unimplemented

takeWhile :: MonadCatch m => (a -> Bool) -> Fold m a b -> Parser m a b Source #

Collect stream elements until an element fails the predicate. The element on which the predicate fails is returned back to the input stream.

  • Stops - when the predicate fails or the collecting fold stops.
  • Fails - never.
>>> Stream.parse (Parser.takeWhile (== 0) Fold.toList) $ Stream.fromList [0,0,1,0,1]
[0,0]

We can implement a breakOn using takeWhile:

breakOn p = takeWhile (not . p)

Pre-release

Note: This is called takeWhileP and munch in some parser libraries.

takeWhile1 :: MonadCatch m => (a -> Bool) -> Fold m a b -> Parser m a b Source #

Like takeWhile but takes at least one element otherwise fails.
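
A sketch of the success case; if the very first element failed the predicate the parser would fail with a ParseError instead:

>>> Stream.parse (Parser.takeWhile1 (== 0) Fold.toList) $ Stream.fromList [0,0,1,0,1]
[0,0]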

Pre-release

drainWhile :: MonadCatch m => (a -> Bool) -> Parser m a () Source #

Drain the input as long as the predicate succeeds, running the effects and discarding the results.

This is also called skipWhile in some parsing libraries.
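
A usage sketch, combining drainWhile with next via the Applicative instance to skip a prefix; the element that fails the predicate is left in the input:

>>> Stream.parse (Parser.drainWhile (< 5) *> Parser.next) $ Stream.fromList [1,2,3,7,8]
Just 7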

Pre-release

sliceSepByP :: MonadCatch m => (a -> Bool) -> Parser m a b -> Parser m a b Source #

sliceSepByP cond parser parses a slice of the input using parser until cond succeeds or the parser stops.

This is a generalized slicing parser which can be used to implement other parsers e.g.:

sliceSepByMax cond n p = sliceSepByP cond (take n p)
sliceSepByBetween cond m n p = sliceSepByP cond (takeBetween m n p)

Pre-release

sliceBeginWith :: MonadCatch m => (a -> Bool) -> Fold m a b -> Parser m a b Source #

Collect stream elements until an element passes the predicate; return the last element on which the predicate succeeded back to the input stream. If the predicate succeeds on the first element itself then the parser does not terminate there. The succeeding element in the leading position is treated as a prefix separator which is kept in the output segment.

  • Stops - when the predicate succeeds in non-leading position.
  • Fails - never.

S.splitWithPrefix pred f = S.parseMany (PR.sliceBeginWith pred f)

Examples:

>>> :{
 sliceBeginWithOdd ls = Stream.parse prsr (Stream.fromList ls)
     where prsr = Parser.sliceBeginWith odd Fold.toList
:}
>>> sliceBeginWithOdd [2, 4, 6, 3]
*** Exception: sliceBeginWith : slice begins with an element which fails the predicate
...
>>> sliceBeginWithOdd [3, 5, 7, 4]
[3]
>>> sliceBeginWithOdd [3, 4, 6, 8, 5]
[3,4,6,8]
>>> sliceBeginWithOdd []
[]

Pre-release

sliceSepWith :: (a -> Bool) -> Fold m a b -> Parser m a b Source #

Like sliceSepBy but does not drop the separator element; instead, the separator is emitted as a separate element in the output.

Unimplemented

escapedSliceSepBy :: (a -> Bool) -> (a -> Bool) -> Fold m a b -> Parser m a b Source #

Like sliceSepBy but the separator elements can be escaped using an escape char determined by the second predicate.

Unimplemented

escapedFrameBy :: (a -> Bool) -> (a -> Bool) -> (a -> Bool) -> Fold m a b -> Parser m a b Source #

escapedFrameBy begin end escape parses a string framed using begin and end as the frame begin and end marker elements and escape as an escaping element to escape the occurrence of the framing elements within the frame. Nested frames are allowed, but nesting is removed when parsing.

For example,

> Stream.parse (Parser.escapedFrameBy (== '{') (== '}') (== '\\') Fold.toList) $ Stream.fromList "{hello}"
"hello"

> Stream.parse (Parser.escapedFrameBy (== '{') (== '}') (== '\\') Fold.toList) $ Stream.fromList "{hello {world}}"
"hello world"

> Stream.parse (Parser.escapedFrameBy (== '{') (== '}') (== '\\') Fold.toList) $ Stream.fromList "{hello \\{world\\}}"
"hello {world}"

> Stream.parse (Parser.escapedFrameBy (== '{') (== '}') (== '\\') Fold.toList) $ Stream.fromList "{hello {world}"
ParseError "Unterminated '{'"

Unimplemented

wordBy :: MonadCatch m => (a -> Bool) -> Fold m a b -> Parser m a b Source #

Like splitOn but strips leading, trailing, and repeated separators. Therefore, ".a..b." with . as the separator would be parsed as ["a","b"]. In other words, it is like parsing words from whitespace-separated text.

  • Stops - when it finds a word separator after a non-word element
  • Fails - never.
S.wordsBy pred f = S.parseMany (PR.wordBy pred f)
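
A sketch matching the equation above, splitting the example input from the description into words:

>>> Stream.toList $ Stream.parseMany (Parser.wordBy (== '.') Fold.toList) $ Stream.fromList ".a..b."
["a","b"]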

groupBy :: MonadCatch m => (a -> a -> Bool) -> Fold m a b -> Parser m a b Source #

Given an input stream [a,b,c,...] and a comparison function cmp, the parser assigns the element a to the first group, then if a `cmp` b is True b is also assigned to the same group. If a `cmp` c is True then c is also assigned to the same group and so on. When the comparison fails the parser is terminated. Each group is folded using the Fold f and the result of the fold is the result of the parser.

  • Stops - when the comparison fails.
  • Fails - never.
>>> :{
 runGroupsBy eq =
     Stream.toList
         . Stream.parseMany (Parser.groupBy eq Fold.toList)
         . Stream.fromList
:}
>>> runGroupsBy (<) []
[]
>>> runGroupsBy (<) [1]
[[1]]
>>> runGroupsBy (<) [3, 5, 4, 1, 2, 0]
[[3,5,4],[1,2],[0]]

Pre-release

groupByRolling :: MonadCatch m => (a -> a -> Bool) -> Fold m a b -> Parser m a b Source #

Unlike groupBy this combinator performs a rolling comparison of two successive elements in the input stream. Assuming the input stream to the parser is [a,b,c,...] and the comparison function is cmp, the parser first assigns the element a to the first group, then if a `cmp` b is True b is also assigned to the same group. If b `cmp` c is True then c is also assigned to the same group and so on. When the comparison fails the parser is terminated. Each group is folded using the Fold f and the result of the fold is the result of the parser.

  • Stops - when the comparison fails.
  • Fails - never.
>>> :{
 runGroupsByRolling eq =
     Stream.toList
         . Stream.parseMany (Parser.groupByRolling eq Fold.toList)
         . Stream.fromList
:}
>>> runGroupsByRolling (<) []
[]
>>> runGroupsByRolling (<) [1]
[[1]]
>>> runGroupsByRolling (<) [3, 5, 4, 1, 2, 0]
[[3,5],[4],[1,2],[0]]

Pre-release

groupByRollingEither :: MonadCatch m => (a -> a -> Bool) -> Fold m a b -> Fold m a c -> Parser m a (Either b c) Source #

Like groupByRolling, but if the predicate is True then collects using the first fold as long as the predicate holds True, if the predicate is False collects using the second fold as long as it remains False. Returns Left for the first case and Right for the second case.

For example, if we want to detect sorted sequences in a stream, both ascending and descending, we can use 'groupByRollingEither (<=) Fold.toList Fold.toList'.

Unimplemented

eqBy :: MonadCatch m => (a -> a -> Bool) -> [a] -> Parser m a () Source #

Match the given sequence of elements using the given comparison function.

>>> Stream.parse (Parser.eqBy (==) "string") $ Stream.fromList "string"
>>> Stream.parse (Parser.eqBy (==) "mismatch") $ Stream.fromList "match"
*** Exception: ParseError "eqBy: failed, yet to match 7 elements"

Pre-release

Unimplemented

, prefixOf -- match any prefix of a given string
, suffixOf -- match any suffix of a given string
, infixOf -- match any substring of a given string

Binary Combinators

Sequential Applicative

serialWith :: MonadCatch m => (a -> b -> c) -> Parser m x a -> Parser m x b -> Parser m x c Source #

Sequential parser application. Apply two parsers sequentially to an input stream. The input is provided to the first parser, when it is done the remaining input is provided to the second parser. If both the parsers succeed their outputs are combined using the supplied function. The operation fails if any of the parsers fail.

Note: This is a parsing dual of appending streams using serial; it splits the stream using two parsers and zips the results.

This implementation is strict in the second argument, therefore, the following will fail:

>>> Stream.parse (Parser.serialWith const (Parser.satisfy (> 0)) undefined) $ Stream.fromList [1]
*** Exception: Prelude.undefined
...

Compare with the Applicative instance method <*>. This implementation allows stream fusion but has quadratic complexity. It can fuse with other operations and can be faster than the Applicative instance for a small number (less than 8) of compositions.

Many combinators can be expressed using serialWith and other parser primitives. Some common idioms are described below:

span :: (a -> Bool) -> Fold m a b -> Fold m a c -> Parser m a (b, c)
span pred f1 f2 = serialWith (,) (takeWhile pred f1) (fromFold f2)
spanBy :: (a -> a -> Bool) -> Fold m a b -> Fold m a c -> Parser m a (b, c)
spanBy eq f1 f2 = serialWith (,) (groupBy eq f1) (fromFold f2)
spanByRolling :: (a -> a -> Bool) -> Fold m a b -> Fold m a c -> Parser m a (b, c)
spanByRolling eq f1 f2 = serialWith (,) (groupByRolling eq f1) (fromFold f2)
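
A hedged usage sketch of the span idiom defined above:

>>> let span p f1 f2 = Parser.serialWith (,) (Parser.takeWhile p f1) (Parser.fromFold f2)
>>> Stream.parse (span (< 3) Fold.toList Fold.toList) $ Stream.fromList [1,2,3,4]
([1,2],[3,4])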

Pre-release

split_ :: MonadCatch m => Parser m x a -> Parser m x b -> Parser m x b Source #

Sequential parser application ignoring the output of the first parser. Apply two parsers sequentially to an input stream. The input is provided to the first parser, when it is done the remaining input is provided to the second parser. The output of the parser is the output of the second parser. The operation fails if any of the parsers fail.

This implementation is strict in the second argument, therefore, the following will fail:

>>> Stream.parse (Parser.split_ (Parser.satisfy (> 0)) undefined) $ Stream.fromList [1]
*** Exception: Prelude.undefined
...

Compare with the Applicative instance method *>. This implementation allows stream fusion but has quadratic complexity. It can fuse with other operations and can be faster than the Applicative instance for a small number (less than 8) of compositions.

Pre-release

Parallel Applicatives

teeWith :: MonadCatch m => (a -> b -> c) -> Parser m x a -> Parser m x b -> Parser m x c Source #

teeWith f p1 p2 distributes its input to both p1 and p2 until both of them succeed or either one of them fails, and combines their output using f. The parser succeeds if both the parsers succeed.
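
A sketch, assuming Fold.sum from Streamly.Data.Fold; both parsers see the same input and the zipped result becomes available once both finish:

>>> Stream.parse (Parser.teeWith (,) (Parser.takeEQ 2 Fold.sum) (Parser.takeEQ 3 Fold.toList)) $ Stream.fromList [1,2,3,4]
(3,[1,2,3])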

Pre-release

teeWithFst :: MonadCatch m => (a -> b -> c) -> Parser m x a -> Parser m x b -> Parser m x c Source #

Like teeWith but ends parsing and zips the results, if available, whenever the first parser ends.

Pre-release

teeWithMin :: MonadCatch m => (a -> b -> c) -> Parser m x a -> Parser m x b -> Parser m x c Source #

Like teeWith but ends parsing and zips the results, if available, whenever any of the parsers ends or fails.

Unimplemented

Sequential Interleaving

deintercalate :: Fold m a y -> Parser m x a -> Fold m b z -> Parser m x b -> Parser m x (y, z) Source #

Apply two parsers alternately to an input stream. The input stream is considered an interleaving of two patterns. The two parsers represent the two patterns.

This undoes a "gintercalate" of two streams.

Unimplemented

Sequential Alternative

alt :: MonadCatch m => Parser m x a -> Parser m x a -> Parser m x a Source #

Sequential alternative. Apply the input to the first parser and return the result if the parser succeeds. If the first parser fails then backtrack and apply the same input to the second parser and return the result.

Note: This implementation is not lazy in the second argument. The following will fail:

>>> Stream.parse (Parser.satisfy (> 0) `Parser.alt` undefined) $ Stream.fromList [1..10]
*** Exception: Prelude.undefined
...

Compare with the Alternative instance method <|>. This implementation allows stream fusion but has quadratic complexity. It can fuse with other operations and can be much faster than the Alternative instance for a small number (less than 8) of alternatives.

Pre-release

Parallel Alternatives

shortest :: MonadCatch m => Parser m x a -> Parser m x a -> Parser m x a Source #

Shortest alternative. Apply both parsers in parallel but choose the result from the one which consumed the least input, i.e. take the shortest succeeding parse.

Pre-release

longest :: MonadCatch m => Parser m x a -> Parser m x a -> Parser m x a Source #

Longest alternative. Apply both parsers in parallel but choose the result from the one which consumed the most input, i.e. take the longest succeeding parse.
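
A sketch contrasting shortest and longest on the same pair of parsers:

>>> Stream.parse (Parser.shortest (Parser.takeEQ 2 Fold.toList) (Parser.takeEQ 3 Fold.toList)) $ Stream.fromList [1,2,3,4]
[1,2]
>>> Stream.parse (Parser.longest (Parser.takeEQ 2 Fold.toList) (Parser.takeEQ 3 Fold.toList)) $ Stream.fromList [1,2,3,4]
[1,2,3]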

Pre-release

N-ary Combinators

Sequential Collection

concatSequence :: Fold m b c -> t (Parser m a b) -> Parser m a c Source #

concatSequence f t collects sequential parses of parsers in the container t using the fold f. Fails if the input ends or any of the parsers fail.

This is the same as sequence but more efficient.

Unimplemented

concatMap :: MonadCatch m => (b -> Parser m a c) -> Parser m a b -> Parser m a c Source #

Map a Parser returning function on the result of a Parser.

Compare with the Monad instance method >>=. This implementation allows stream fusion but has quadratic complexity. It can fuse with other operations and can be much faster than the Monad instance for a small number (less than 8) of compositions.
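
A sketch of a length-prefixed parse: the first element tells concatMap how many of the following elements to take:

>>> Stream.parse (Parser.concatMap (\n -> Parser.takeEQ (fromIntegral n) Fold.toList) (Parser.satisfy (> 0))) $ Stream.fromList [2,10,20,30]
[10,20]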

Pre-release

Sequential Repetition

count :: Int -> Parser m a b -> Fold m b c -> Parser m a c Source #

count n p f collects exactly n sequential parses of parser p using the fold f. Fails if the input ends or the parser fails before n results are collected.

Unimplemented

countBetween :: Int -> Int -> Parser m a b -> Fold m b c -> Parser m a c Source #

countBetween m n p f collects between m and n sequential parses of parser p using the fold f. Stops after collecting n results. Fails if the input ends or the parser fails before m results are collected.

Unimplemented

manyP :: Parser m a b -> Parser m b c -> Parser m a c Source #

Like many but uses a Parser instead of a Fold to collect the results. Parsing stops or fails if the collecting parser stops or fails.

Unimplemented

many :: MonadCatch m => Parser m a b -> Fold m b c -> Parser m a c Source #

Collect zero or more parses. Apply the supplied parser repeatedly on the input stream and push the parse results to a downstream fold.

Stops: when the downstream fold stops or the parser fails. Fails: never, produces zero or more results.

Compare with many.
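
A usage sketch; the element on which the supplied parser fails stays in the input and many succeeds with whatever was collected:

>>> Stream.parse (Parser.many (Parser.satisfy (> 0)) Fold.toList) $ Stream.fromList [1,2,3,0,4]
[1,2,3]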

Pre-release

some :: MonadCatch m => Parser m a b -> Fold m b c -> Parser m a c Source #

Collect one or more parses. Apply the supplied parser repeatedly on the input stream and push the parse results to a downstream fold.

Stops: when the downstream fold stops or the parser fails. Fails: if it stops without producing a single result.

some parser fld = manyP parser (takeGE 1 fld)

Compare with some.

Pre-release

manyTillP :: Parser m a b -> Parser m a x -> Parser m b c -> Parser m a c Source #

Like manyTill but uses a Parser to collect the results instead of a Fold. Parsing stops or fails if the collecting parser stops or fails.

We can implement parsers like the following using manyTillP:

countBetweenTill m n f p = manyTillP (takeBetween m n f) p

Unimplemented

manyTill :: MonadCatch m => Parser m a b -> Parser m a x -> Fold m b c -> Parser m a c Source #

manyTill collect test f tries the parser test on the input; if test fails it backtracks and tries collect; after collect succeeds, test is tried again, and so on. The parser stops when test succeeds. The output of test is discarded and the output of collect is accumulated by the supplied fold f. The parser fails if collect fails.

Stops when the fold f stops.
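
A hedged sketch, collecting positive numbers until a zero is seen; the zero is consumed by the test parser and its output discarded:

>>> Stream.parse (Parser.manyTill (Parser.satisfy (> 0)) (Parser.satisfy (== 0)) Fold.toList) $ Stream.fromList [1,2,3,0,4]
[1,2,3]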

Pre-release

manyThen :: Parser m a b -> Parser m a x -> Fold m b c -> Parser m a c Source #

manyThen collect recover f repeats the parser collect on the input and collects the output in the supplied fold f. If the parser collect fails, the parser recover is run until it stops and then we start repeating the parser collect again. The parser fails if the recovery parser fails.

For example, this can be used to find a key frame in a video stream after an error.

Unimplemented

Special cases

TODO: traditional implementations of these may be of limited use. For example, consider parsing lines separated by \r\n. The main parser will have to detect and exclude the sequence \r\n anyway so that we can apply the "sep" parser.

We can instead implement these as special cases of deintercalate.

, endBy
, sepBy
, sepEndBy
, beginBy
, sepBeginBy
, sepAroundBy

Distribution

A simple and naive implementation would be to just convert the stream to an array and give the array reference to all consumers. The array can be grown on demand by any consumer and truncated when nobody needs it.

Distribute to collection

Distribute to repetition

Interleaved collection

  1. Round robin
  2. Priority based

roundRobin :: t (Parser m a b) -> Fold m b c -> Parser m a c Source #

Apply a collection of parsers to an input stream in a round robin fashion. Each parser is applied until it stops and then we repeat starting with the first parser again.

Unimplemented

Collection of Alternatives

Unimplemented

, shortestN
, longestN
, fastestN -- first N successful in time
, choiceN  -- first N successful in position

choice :: (Functor t, Foldable t, MonadCatch m) => t (Parser m a b) -> Parser m a b Source #

choice parsers applies the parsers in order and returns the first successful parse.

This is the same as asum but more efficient.

Broken

Repeated Alternatives

retryMaxTotal :: Int -> Parser m a b -> Fold m b c -> Parser m a c Source #

Keep trying a parser up to a maximum of n failures. When the parser fails the input consumed till now is dropped and the new instance is tried on the fresh input.

Unimplemented

retryMaxSuccessive :: Int -> Parser m a b -> Fold m b c -> Parser m a c Source #

Like retryMaxTotal but aborts after n successive failures.

Unimplemented

retry :: Parser m a b -> Parser m a b Source #

Keep trying a parser until it succeeds. When the parser fails the input consumed till now is dropped and the new instance is tried on the fresh input.

Unimplemented