License | CC0-1.0 |
---|---|
Maintainer | mordae@anilinux.org |
Stability | unstable |
Portability | non-portable (ghc) |
Safe Haskell | None |
Language | Haskell2010 |
This module provides a parser for ByteString
.
- If you'd like to parse ASCII text, you might want to take a look at
Data.ByteString.Parser.Char8. It reuses the same
Parser
, but provides functions working withChar
instead ofWord8
as well as more string utilities. - If you'd like to parse Unicode text, look instead at the Data.Text.Parser. Is is slower, but in a way more correct.
Synopsis
- newtype Parser a = Parser {
- runParser :: ByteString -> Result a
- data Result a
- = Success a !ByteString
- | Failure [String] !ByteString
- | Error String !ByteString !Int
- parseOnly :: Parser a -> ByteString -> Either String a
- byte :: Word8 -> Parser Word8
- notByte :: Word8 -> Parser Word8
- anyByte :: Parser Word8
- satisfy :: (Word8 -> Bool) -> Parser Word8
- peekByte :: Parser Word8
- string :: ByteString -> Parser ByteString
- take :: Int -> Parser ByteString
- scan :: s -> (s -> Word8 -> Maybe s) -> Parser ByteString
- runScanner :: s -> (s -> Word8 -> Maybe s) -> Parser (ByteString, s)
- takeWhile :: (Word8 -> Bool) -> Parser ByteString
- takeWhile1 :: (Word8 -> Bool) -> Parser ByteString
- takeTill :: (Word8 -> Bool) -> Parser ByteString
- takeTill1 :: (Word8 -> Bool) -> Parser ByteString
- provided :: (Alternative m, Monad m) => m a -> (a -> Bool) -> m a
- choice :: Alternative f => [f a] -> f a
- branch :: [(Parser a, a -> Parser b)] -> Parser b
- count :: Monad m => Int -> m a -> m [a]
- optional :: Alternative f => f a -> f (Maybe a)
- eitherP :: Alternative f => f a -> f b -> f (Either a b)
- option :: Alternative f => a -> f a -> f a
- many :: Alternative f => f a -> f [a]
- many1 :: Alternative f => f a -> f [a]
- manyTill :: Alternative f => f a -> f a -> f [a]
- sepBy :: Alternative f => f a -> f b -> f [a]
- sepBy1 :: Alternative f => f a -> f b -> f [a]
- wrap :: Applicative f => f a -> f b -> f a
- match :: Parser a -> Parser (ByteString, a)
- label :: String -> Parser a -> Parser a
- unlabel :: Parser a -> Parser a
- validate :: (a -> Either String b) -> Parser a -> Parser b
- takeByteString :: Parser ByteString
- peekByteString :: Parser ByteString
- endOfInput :: Parser ()
- atEnd :: Parser Bool
- offset :: ByteString -> ByteString -> Int
- empty :: Alternative f => f a
- pure :: Applicative f => a -> f a
- guard :: Alternative f => Bool -> f ()
- when :: Applicative f => Bool -> f () -> f ()
- unless :: Applicative f => Bool -> f () -> f ()
- void :: Functor f => f a -> f ()
Documentation
Parser for ByteString
inputs.
Parser | |
|
Result represents either success or some kind of failure.
You can find the problematic offset by subtracting length of the remainder from length of the original input.
Success a !ByteString | Parser successfully matched the input. Produces the parsing result and the remainder of the input. |
Failure [String] !ByteString | Parser failed to match the input. Produces list of expected inputs and the corresponding remainder. |
Error String !ByteString !Int | Parser ran into an error. Either syntactic or a validation one. |
parseOnly :: Parser a -> ByteString -> Either String a Source #
Discards the remaining input and returns just the parse result.
You might want to combine it with endOfInput
for the best effect.
Example:
parseOnly (pContacts <* endOfInput) bstr
Bytes
peekByte :: Parser Word8 Source #
Peeks ahead, but does not consume.
Be careful, peeking behind end of the input fails.
You might want to check using atEnd
beforehand.
Strings
string :: ByteString -> Parser ByteString Source #
Accepts a matching string.
take :: Int -> Parser ByteString Source #
Accepts given number of bytes. Fails when not enough bytes are available.
scan :: s -> (s -> Word8 -> Maybe s) -> Parser ByteString Source #
Scans ahead statefully and then accepts whatever bytes the scanner liked.
Scanner returns Nothing
to mark end of the acceptable extent.
runScanner :: s -> (s -> Word8 -> Maybe s) -> Parser (ByteString, s) Source #
Like scan
, but also returns the final scanner state.
takeWhile :: (Word8 -> Bool) -> Parser ByteString Source #
Efficiently consume as long as the input bytes match the predicate.
An inverse of takeTill
.
takeWhile1 :: (Word8 -> Bool) -> Parser ByteString Source #
Like takeWhile
, but requires at least a single byte.
takeTill :: (Word8 -> Bool) -> Parser ByteString Source #
Efficiently consume until a byte matching the predicate is found.
An inverse of takeWhile
.
takeTill1 :: (Word8 -> Bool) -> Parser ByteString Source #
Same as takeTill
, but requires at least a single byte.
Combinators
provided :: (Alternative m, Monad m) => m a -> (a -> Bool) -> m a Source #
Fails if the value returned by the parser does not conform to the
predicate. Generalized form of string
.
Example:
pInput = takeWhile isLetter `provided` (odd . length)
choice :: Alternative f => [f a] -> f a Source #
Tries various parsers, one by one.
Example:
pExpression = choice [ pConstant , pVariable , pBinaryOperation , pFunctionApplication ]
branch :: [(Parser a, a -> Parser b)] -> Parser b Source #
Given list of matchers and parsers, runs the first parser whose matcher
succeeds on the input. This pattern makes for a simpler alternative to
try
used in other parser combinator libraries.
Example:
pProperty = branch [ ( string "public" <* skipSpace , _ -> Property Public $ pToken ) , ( string "private" <* skipSpace , _ -> Property Private $ pToken ) ]
count :: Monad m => Int -> m a -> m [a] Source #
Replicates the parser given number of times, collecting the results in a list. Fails if any instance of the parser fails.
Example:
pFourWords = (:) <$> word <*> count 3 (blank *> word) where word = takeWhile1 isLetter blank = takeWhile1 isSpace
optional :: Alternative f => f a -> f (Maybe a) #
One or none.
eitherP :: Alternative f => f a -> f b -> f (Either a b) Source #
Captures first parser as Left
or the second as Right
.
option :: Alternative f => a -> f a -> f a Source #
Shortcut for optional
with a default value.
Example:
data Contact = Contact { contactName :: Text , contactEmail :: Maybe Text } pContact = Contact <$> pFullName <*> option pEmail
many :: Alternative f => f a -> f [a] #
Zero or more.
many1 :: Alternative f => f a -> f [a] Source #
Like many1
, but requires at least one match.
manyTill :: Alternative f => f a -> f a -> f [a] Source #
Like many
, but stops once the second parser matches the input ahead.
Example:
pBodyLines = pLine `manyTill` pEnd where pLine = takeTill (== 'n') pEnd = string "n.n"
sepBy :: Alternative f => f a -> f b -> f [a] Source #
sepBy1 :: Alternative f => f a -> f b -> f [a] Source #
Like sepBy
, but requires at least one match.
wrap :: Applicative f => f a -> f b -> f a Source #
Wraps the parser from both sides.
Example:
pToken = takeWhile1 (inClass "A-Za-z0-9_") `wrap` takeWhile isSpace
match :: Parser a -> Parser (ByteString, a) Source #
Makes the parser not only return the result, but also the original matched extent.
label :: String -> Parser a -> Parser a Source #
Names an extent of the parser.
When the extent returns a Failure, details are discarded and replaced with the extent as a whole.
When the extent returns an Error, it is adjusted to cover the whole extent, but the reason is left intact.
You should strive to make labeled extents as small as possible, approximately of a typical token size. For example:
pString = label "string" $ pStringContents `wrap` char '"'
unlabel :: Parser a -> Parser a Source #
Un-names an extent of the parser.
Same as label
, but removes any expected values upon Failure.
Very useful to mark comments and optional whitespace with.
validate :: (a -> Either String b) -> Parser a -> Parser b Source #
Validate parser result and turn it into an Error upon failure.
End Of Input
takeByteString :: Parser ByteString Source #
Accept whatever input remains.
peekByteString :: Parser ByteString Source #
Peek at whatever input remains.
endOfInput :: Parser () Source #
Accepts end of input and fails if we are not there yet.
Position
offset :: ByteString -> ByteString -> Int Source #
Calculate offset from the original input and the remainder.
Miscelaneous
These are all generic methods, but since I sometimes forget about them, it is nice to have them listed here for reference what writing parsers.
empty :: Alternative f => f a #
The identity of <|>
pure :: Applicative f => a -> f a #
Lift a value.
guard :: Alternative f => Bool -> f () #
Conditional failure of Alternative
computations. Defined by
guard True =pure
() guard False =empty
Examples
Common uses of guard
include conditionally signaling an error in
an error monad and conditionally rejecting the current choice in an
Alternative
-based parser.
As an example of signaling an error in the error monad Maybe
,
consider a safe division function safeDiv x y
that returns
Nothing
when the denominator y
is zero and
otherwise. For example:Just
(x `div`
y)
>>> safeDiv 4 0 Nothing >>> safeDiv 4 2 Just 2
A definition of safeDiv
using guards, but not guard
:
safeDiv :: Int -> Int -> Maybe Int safeDiv x y | y /= 0 = Just (x `div` y) | otherwise = Nothing
A definition of safeDiv
using guard
and Monad
do
-notation:
safeDiv :: Int -> Int -> Maybe Int safeDiv x y = do guard (y /= 0) return (x `div` y)
when :: Applicative f => Bool -> f () -> f () #
Conditional execution of Applicative
expressions. For example,
when debug (putStrLn "Debugging")
will output the string Debugging
if the Boolean value debug
is True
, and otherwise do nothing.
unless :: Applicative f => Bool -> f () -> f () #
The reverse of when
.
void :: Functor f => f a -> f () #
discards or ignores the result of evaluation, such
as the return value of an void
valueIO
action.
Using ApplicativeDo
: '
' can be understood as the
void
asdo
expression
do as pure ()
with an inferred Functor
constraint.
Examples
Replace the contents of a
with unit:Maybe
Int
>>>
void Nothing
Nothing>>>
void (Just 3)
Just ()
Replace the contents of an
with unit, resulting in an Either
Int
Int
:Either
Int
()
>>>
void (Left 8675309)
Left 8675309>>>
void (Right 8675309)
Right ()
Replace every element of a list with unit:
>>>
void [1,2,3]
[(),(),()]
Replace the second element of a pair with unit:
>>>
void (1,2)
(1,())
Discard the result of an IO
action:
>>>
mapM print [1,2]
1 2 [(),()]>>>
void $ mapM print [1,2]
1 2