Copyright	Daan Leijen 1999-2001, Bryan O'Sullivan 2007-2015, Winterland 2016
License	BSD3
Maintainer	drkoster@qq.com
Stability	experimental
Portability	portable
Safe Haskell	None
Language	Haskell2010

Data.Binary.Parser

Contents

Running parsers
Decoder conversion
Combinators
Re-exports

Description

This library provide parsec/attoparsec style parsing combinators for binary package. By default, this module export combinators in Data.Binary.Get, Data.Binary.Parser.Word8 and Data.Binary.Parser.Numeric, for additional ASCII char parser, please check Data.Binary.Parser.Char8 module.

The behaviour of parsers here is different to that of the similarly-named parser in Parsec, as this one is all-or-nothing. To illustrate the difference, the following parser will fail under Parsec given an input of "for":

string "foo" <|> string "for"

The reason for its failure is that the first branch is a partial match, and will consume the letters 'f' and 'o' before failing. In binary-parsers, the above parser will succeed on that input, because the failed first branch will consume nothing.

There're some redundant combinators get removed, for example:

choice == asum
count == replicateM
atEnd == isEmpty
take == getByteString
many1 == some

For fast byte set operations, please use charset package.

It's recommanded to use parseOnly, parseDetail... functions to run your parsers since these functions are faster than binary's counter part by avoiding a small constant overhead. Check parse for detail.

A few words on performance and backtracking

There's a common belief that parsers which support backtracking are slow, but it's not neccessarily true in binary, because binary doesn't do book keeping if you doesn't use <|>, lookAhead or their friends. Combinators in this library like peek, string... also try to avoid backtracking so it's faster to use them rather than do backtracking yourself, for example, peek is faster than lookAhead getWord8. In practice, protocols are often designed to avoid backtracking. For example, if you have following parser:

branch1 <|> branch2 <|> (skipN 1 >> branch3)

And if you can select the right branch just by looking ahead one byte, then you can rewrite it to:

w <- peek
if  | w == b1 -> branch1
    | w == b2 -> branch2
    | w == b3 -> skipN 1 >> branch3

Binary performs as fast as a non-backtracking parser as long as you construct your parser without using backtracking. And sometime backtracking is indeed neccessary, for example scientifically is almost impossible to implement correctly if you don't do backtracking.

Synopsis

Running parsers

type Parser a = Get a Source #

Alias to Get for attoparsec compatibility.

parseOnly :: Get a -> ByteString -> Either String a Source #

Run a parser on ByteString.

This function does not force a parser to consume all of its input. Instead, any residual input will be discarded. To force a parser to consume all of its input, use something like this:

parseOnly (myParser <* endOfInput)

parseLazy :: Get a -> ByteString -> Either String a Source #

Similar to parseOnly, but run a parser on lazy ByteString.

parseDetail :: Get a -> ByteString -> Either (ByteString, ByteOffset, String) (ByteString, ByteOffset, a) Source #

Run a parser on ByteString.

This function return full parsing results: the rest of input, stop offest and fail message or parsing result.

Since: 0.2.1.0

parseDetailLazy :: Get a -> ByteString -> Either (ByteString, ByteOffset, String) (ByteString, ByteOffset, a) Source #

Similar to parseDetail, but run a parser on lazy ByteString.

Since: 0.2.1.0

parse :: Get a -> ByteString -> Decoder a Source #

Run a Get monad. See Decoder for what to do next, like providing input, handling decoding errors and to get the output value.

This's faster than runGetIncremental becuase it provides an initial chunk rather than feeding empty and waiting for chunks, this overhead is noticeable when you're running small getters over short ByteString s.

Since: 0.2.1.0

Decoder conversion

maybeDecoder :: Decoder r -> Maybe r Source #

Convert a Decoder value to a Maybe value. A Partial result is treated as failure.

Since: 0.2.3.0

eitherDecoder :: Decoder r -> Either String r Source #

Convert a Decoder value to an Either value. A Partial result is treated as failure.

Since: 0.2.3.0

Combinators

(<?>) :: Get a -> String -> Get a infix 0 Source #

Name the parser, in case failure occurs.

endOfInput :: Get () Source #

Match only if all input has been consumed.

option :: Alternative f => a -> f a -> f a Source #

option x p tries to apply action p. If p fails without consuming input, it returns the value x, otherwise the value returned by p.

priority  = option 0 (digitToInt <$> digit)

eitherP :: Alternative f => f a -> f b -> f (Either a b) Source #

Combine two alternatives.

match :: Get a -> Get (ByteString, a) Source #

Return both the result of a parse and the portion of the input that was consumed while it was being parsed.

many' :: MonadPlus m => m a -> m [a] Source #

many' p applies the action p zero or more times. Returns a list of the returned values of p. The value returned by p is forced to WHNF.

 word  = many' letter

some' :: MonadPlus m => m a -> m [a] Source #

some' p applies the action p one or more times. Returns a list of the returned values of p. The value returned by p is forced to WHNF.

 word  = some' letter

sepBy :: Alternative f => f a -> f s -> f [a] Source #

sepBy p sep applies zero or more occurrences of p, separated by sep. Returns a list of the values returned by p.

commaSep p  = p `sepBy` (char ',')

sepBy' :: MonadPlus m => m a -> m s -> m [a] Source #

sepBy' p sep applies zero or more occurrences of p, separated by sep. Returns a list of the values returned by p. The value returned by p is forced to WHNF.

commaSep p  = p `sepBy'` (char ',')

sepBy1 :: Alternative f => f a -> f s -> f [a] Source #

sepBy1 p sep applies one or more occurrences of p, separated by sep. Returns a list of the values returned by p.

commaSep p  = p `sepBy1` (char ',')

sepBy1' :: MonadPlus m => m a -> m s -> m [a] Source #

sepBy1' p sep applies one or more occurrences of p, separated by sep. Returns a list of the values returned by p. The value returned by p is forced to WHNF.

commaSep p  = p `sepBy1'` (char ',')

manyTill :: Alternative f => f a -> f b -> f [a] Source #

manyTill p end applies action p zero or more times until action end succeeds, and returns the list of values returned by p. This can be used to scan comments:

 simpleComment   = string "<!--" *> manyTill anyChar (string "-->")

(Note the overlapping parsers anyChar and string "-->". While this will work, it is not very efficient, as it will cause a lot of backtracking.)

manyTill' :: MonadPlus m => m a -> m b -> m [a] Source #

manyTill' p end applies action p zero or more times until action end succeeds, and returns the list of values returned by p. This can be used to scan comments:

 simpleComment   = string "<!--" *> manyTill' anyChar (string "-->")

(Note the overlapping parsers anyChar and string "-->". While this will work, it is not very efficient, as it will cause a lot of backtracking.)

The value returned by p is forced to WHNF.

skipMany :: Alternative f => f a -> f () Source #

Skip zero or more instances of an action.

skipMany1 :: Alternative f => f a -> f () Source #

Skip one or more instances of an action.

Re-exports

module Data.Binary.Get

module Data.Binary.Parser.Word8

module Data.Binary.Parser.Numeric