License	CC0-1.0
Maintainer	mordae@anilinux.org
Stability	unstable
Portability	non-portable (ghc)
Safe Haskell	None
Language	Haskell2010

Data.ByteString.Parser.Char8

Contents

Characters
Strings
Numbers
Combinators
End Of Input
Position
Miscelaneous

Description

This module provides a parser for ASCII ByteString.

If you'd like to parse Unicode text, look instead at the Data.Text.Parser. Is is slower, but in a way more correct.
If you'd like to parse byte sequences, look instead at the Data.ByteString.Parser. It reuses the same Parser, but provides functions working with Word8 instead of Char.

Synopsis

newtype Parser a = Parser {
- runParser :: ByteString -> Result a
}
data Result a
- = Success a !ByteString
- | Failure [String] !ByteString
- | Error String !ByteString !Int
parseOnly :: Parser a -> ByteString -> Either String a
char :: Char -> Parser Char
notChar :: Char -> Parser Char
anyChar :: Parser Char
satisfy :: (Char -> Bool) -> Parser Char
space :: Parser Char
isSpace :: Char -> Bool
skipSpace :: Parser ()
peekChar :: Parser Char
string :: ByteString -> Parser ByteString
stringCI :: ByteString -> Parser ByteString
take :: Int -> Parser ByteString
scan :: s -> (s -> Char -> Maybe s) -> Parser ByteString
runScanner :: s -> (s -> Char -> Maybe s) -> Parser (ByteString, s)
inRange :: Char -> Char -> Char -> Bool
notInRange :: Char -> Char -> Char -> Bool
takeWhile :: (Char -> Bool) -> Parser ByteString
takeWhile1 :: (Char -> Bool) -> Parser ByteString
takeTill :: (Char -> Bool) -> Parser ByteString
takeTill1 :: (Char -> Bool) -> Parser ByteString
signed :: Num a => Parser a -> Parser a
decimal :: Integral a => Parser a
hexadecimal :: Integral a => Parser a
octal :: Integral a => Parser a
fractional :: Fractional a => Parser a
provided :: (Alternative m, Monad m) => m a -> (a -> Bool) -> m a
choice :: Alternative f => [f a] -> f a
branch :: [(Parser a, a -> Parser b)] -> Parser b
count :: Monad m => Int -> m a -> m [a]
optional :: Alternative f => f a -> f (Maybe a)
eitherP :: Alternative f => f a -> f b -> f (Either a b)
option :: Alternative f => a -> f a -> f a
many :: Alternative f => f a -> f [a]
many1 :: Alternative f => f a -> f [a]
manyTill :: Alternative f => f a -> f a -> f [a]
sepBy :: Alternative f => f a -> f b -> f [a]
sepBy1 :: Alternative f => f a -> f b -> f [a]
wrap :: Applicative f => f a -> f b -> f a
match :: Parser a -> Parser (ByteString, a)
label :: String -> Parser a -> Parser a
unlabel :: Parser a -> Parser a
validate :: (a -> Either String b) -> Parser a -> Parser b
takeByteString :: Parser ByteString
peekByteString :: Parser ByteString
endOfInput :: Parser ()
atEnd :: Parser Bool
offset :: ByteString -> ByteString -> Int
position :: ByteString -> ByteString -> (Int, Int)
explain :: String -> ByteString -> Result a -> Explanation
data Explanation = Explanation {
- exSource :: String
- exSpanFrom :: (Int, Int)
- exSpanTo :: (Int, Int)
- exMessage :: String
}
empty :: Alternative f => f a
pure :: Applicative f => a -> f a
guard :: Alternative f => Bool -> f ()
when :: Applicative f => Bool -> f () -> f ()
unless :: Applicative f => Bool -> f () -> f ()
void :: Functor f => f a -> f ()

Documentation

newtype Parser a Source #

Parser for ByteString inputs.

Constructors

Parser
Fields runParser :: ByteString -> Result a Run the parser on specified input.

Instances

Instances details

Monad Parser Source #
Instance details Defined in Data.ByteString.Parser Methods (>>=) :: Parser a -> (a -> Parser b) -> Parser b # (>>) :: Parser a -> Parser b -> Parser b # return :: a -> Parser a #
Functor Parser Source #
Instance details Defined in Data.ByteString.Parser Methods fmap :: (a -> b) -> Parser a -> Parser b # (<$) :: a -> Parser b -> Parser a #
MonadFail Parser Source #
Instance details Defined in Data.ByteString.Parser Methods fail :: String -> Parser a #
Applicative Parser Source #
Instance details Defined in Data.ByteString.Parser Methods pure :: a -> Parser a # (<>) :: Parser (a -> b) -> Parser a -> Parser b # liftA2 :: (a -> b -> c) -> Parser a -> Parser b -> Parser c # (>) :: Parser a -> Parser b -> Parser b # (<*) :: Parser a -> Parser b -> Parser a #
Alternative Parser Source #
Instance details Defined in Data.ByteString.Parser Methods empty :: Parser a # (<\|>) :: Parser a -> Parser a -> Parser a # some :: Parser a -> Parser [a] # many :: Parser a -> Parser [a] #
MonadPlus Parser Source #
Instance details Defined in Data.ByteString.Parser Methods mzero :: Parser a # mplus :: Parser a -> Parser a -> Parser a #

data Result a Source #

Result represents either success or some kind of failure.

You can find the problematic offset by subtracting length of the remainder from length of the original input.

Constructors

Success a !ByteString	Parser successfully matched the input. Produces the parsing result and the remainder of the input.
Failure [String] !ByteString	Parser failed to match the input. Produces list of expected inputs and the corresponding remainder.
Error String !ByteString !Int	Parser ran into an error. Either syntactic or a validation one.

Instances

Instances details

Functor Result Source #
Instance details Defined in Data.ByteString.Parser Methods fmap :: (a -> b) -> Result a -> Result b # (<$) :: a -> Result b -> Result a #
Eq a => Eq (Result a) Source #
Instance details Defined in Data.ByteString.Parser Methods (==) :: Result a -> Result a -> Bool # (/=) :: Result a -> Result a -> Bool #
Show a => Show (Result a) Source #
Instance details Defined in Data.ByteString.Parser Methods showsPrec :: Int -> Result a -> ShowS # show :: Result a -> String # showList :: [Result a] -> ShowS #

parseOnly :: Parser a -> ByteString -> Either String a Source #

Discards the remaining input and returns just the parse result. You might want to combine it with endOfInput for the best effect.

Example:

parseOnly (pContacts <* endOfInput) bstr

Characters

char :: Char -> Parser Char Source #

Accepts a single, matching ASCII character.

notChar :: Char -> Parser Char Source #

Accepts a single, differing ASCII character.

anyChar :: Parser Char Source #

Accepts a single character.

satisfy :: (Char -> Bool) -> Parser Char Source #

Accepts a single character matching the predicate.

space :: Parser Char Source #

Accepts a single ASCII white space character. See isSpace for details.

isSpace :: Char -> Bool Source #

True for any of the [' ', '\t', '\n', '\v', '\f', '\r'] characters.

Please note that Data.Text.Parser re-exports isString, that considers more unicode codepoints, making it significantly slower.

skipSpace :: Parser () Source #

Accepts multiple ASCII white space characters. See isSpace for details.

peekChar :: Parser Char Source #

Peeks ahead, but does not consume.

Be careful, peeking behind end of the input fails. You might want to check using atEnd beforehand.

Strings

string :: ByteString -> Parser ByteString Source #

Accepts a matching string.

stringCI :: ByteString -> Parser ByteString Source #

Accepts a matching string. Matching is performed in a case-insensitive manner under ASCII.

take :: Int -> Parser ByteString Source #

Accepts given number of bytes. Fails when not enough bytes are available.

scan :: s -> (s -> Char -> Maybe s) -> Parser ByteString Source #

Scans ahead statefully and then accepts whatever bytes the scanner liked. Scanner returns Nothing to mark end of the acceptable extent.

runScanner :: s -> (s -> Char -> Maybe s) -> Parser (ByteString, s) Source #

Like scan, but also returns the final scanner state.

inRange :: Char -> Char -> Char -> Bool Source #

Tests whether the character lies within given range.

Definition:

inRange lo hi = c -> (lo <= c && c <= hi)

notInRange :: Char -> Char -> Char -> Bool Source #

Negation of inRange.

Definition:

notInRange lo hi = c -> (c <= lo || hi <= c)

takeWhile :: (Char -> Bool) -> Parser ByteString Source #

Efficiently consume as long as the input characters match the predicate. An inverse of takeTill.

takeWhile1 :: (Char -> Bool) -> Parser ByteString Source #

Like takeWhile, but requires at least a single character.

takeTill :: (Char -> Bool) -> Parser ByteString Source #

Efficiently consume until a character matching the predicate is found. An inverse of takeWhile.

takeTill1 :: (Char -> Bool) -> Parser ByteString Source #

Same as takeTill, but requires at least a single character.

Numbers

signed :: Num a => Parser a -> Parser a Source #

Accepts optional '+' or '-' character and then applies it to the following parser result.

decimal :: Integral a => Parser a Source #

Accepts an integral number in the decimal format.

hexadecimal :: Integral a => Parser a Source #

Accepts an integral number in the hexadecimal format in either case. Does not look for 0x or similar prefixes.

octal :: Integral a => Parser a Source #

Accepts an integral number in the octal format.

fractional :: Fractional a => Parser a Source #

Accepts a fractional number as a decimal optinally followed by a colon and the fractional part. Does not support exponentiation.

Combinators

provided :: (Alternative m, Monad m) => m a -> (a -> Bool) -> m a Source #

Fails if the value returned by the parser does not conform to the predicate. Generalized form of string.

Example:

pInput = takeWhile isLetter `provided` (odd . length)

choice :: Alternative f => [f a] -> f a Source #

Tries various parsers, one by one.

Example:

pExpression = choice [ pConstant
                     , pVariable
                     , pBinaryOperation
                     , pFunctionApplication
                     ]

branch :: [(Parser a, a -> Parser b)] -> Parser b Source #

Given list of matchers and parsers, runs the first parser whose matcher succeeds on the input. This pattern makes for a simpler alternative to try used in other parser combinator libraries.

Example:

pProperty = branch [ ( string "public" <* skipSpace
                     , _ -> Property Public $ pToken
                     )
                   , ( string "private" <* skipSpace
                     , _ -> Property Private $ pToken
                     )
                   ]

count :: Monad m => Int -> m a -> m [a] Source #

Replicates the parser given number of times, collecting the results in a list. Fails if any instance of the parser fails.

Example:

pFourWords = (:) <$> word <*> count 3 (blank *> word)
  where word  = takeWhile1 isLetter
        blank = takeWhile1 isSpace

optional :: Alternative f => f a -> f (Maybe a) #

One or none.

eitherP :: Alternative f => f a -> f b -> f (Either a b) Source #

Captures first parser as Left or the second as Right.

option :: Alternative f => a -> f a -> f a Source #

Shortcut for optional with a default value.

Example:

data Contact =
 Contact
   { contactName  :: Text
   , contactEmail :: Maybe Text
   }

pContact = Contact <$> pFullName <*> option pEmail

many :: Alternative f => f a -> f [a] #

Zero or more.

many1 :: Alternative f => f a -> f [a] Source #

Like many1, but requires at least one match.

manyTill :: Alternative f => f a -> f a -> f [a] Source #

Like many, but stops once the second parser matches the input ahead.

Example:

pBodyLines = pLine `manyTill` pEnd
  where pLine = takeTill (== 'n')
        pEnd  = string "n.n"

sepBy :: Alternative f => f a -> f b -> f [a] Source #

Similar to many, but interleaves the first parser with the second.

Example:

pLines = pLine sepBy char 'n'

sepBy1 :: Alternative f => f a -> f b -> f [a] Source #

Like sepBy, but requires at least one match.

wrap :: Applicative f => f a -> f b -> f a Source #

Wraps the parser from both sides.

Example:

pToken = takeWhile1 (inClass "A-Za-z0-9_") `wrap` takeWhile isSpace

match :: Parser a -> Parser (ByteString, a) Source #

Makes the parser not only return the result, but also the original matched extent.

label :: String -> Parser a -> Parser a Source #

Names an extent of the parser.

When the extent returns a Failure, details are discarded and replaced with the extent as a whole.

When the extent returns an Error, it is adjusted to cover the whole extent, but the reason is left intact.

You should strive to make labeled extents as small as possible, approximately of a typical token size. For example:

pString = label "string" $ pStringContents `wrap` char '"'

unlabel :: Parser a -> Parser a Source #

Un-names an extent of the parser.

Same as label, but removes any expected values upon Failure. Very useful to mark comments and optional whitespace with.

validate :: (a -> Either String b) -> Parser a -> Parser b Source #

Validate parser result and turn it into an Error upon failure.

End Of Input

takeByteString :: Parser ByteString Source #

Accept whatever input remains.

peekByteString :: Parser ByteString Source #

Peek at whatever input remains.

endOfInput :: Parser () Source #

Accepts end of input and fails if we are not there yet.

atEnd :: Parser Bool Source #

Returns whether we are at the end of the input yet.

Position

offset :: ByteString -> ByteString -> Int Source #

Calculate offset from the original input and the remainder.

position :: ByteString -> ByteString -> (Int, Int) Source #

Determine (line, column) from the original input and the remainder.

Counts line feed characters leading to the offset, so only use it on your slow path. For example when describing parsing errors.

explain :: String -> ByteString -> Result a -> Explanation Source #

Process the result for showing it to the user.

data Explanation Source #

More precise Result description produced by explain.

Constructors

Explanation
Fields exSource :: String Name of the source file. exSpanFrom :: (Int, Int) Line and column where the problem starts. exSpanTo :: (Int, Int) Line and column where the problem ends. exMessage :: String Message associated with the problem.

Instances

Instances details

Eq Explanation Source #
Instance details Defined in Data.ByteString.Parser.Char8 Methods (==) :: Explanation -> Explanation -> Bool # (/=) :: Explanation -> Explanation -> Bool #
Show Explanation Source #
Instance details Defined in Data.ByteString.Parser.Char8 Methods showsPrec :: Int -> Explanation -> ShowS # show :: Explanation -> String # showList :: [Explanation] -> ShowS #

Miscelaneous

These are all generic methods, but since I sometimes forget about them, it is nice to have them listed here for reference what writing parsers.

empty :: Alternative f => f a #

The identity of <|>

pure :: Applicative f => a -> f a #

Lift a value.

guard :: Alternative f => Bool -> f () #

Conditional failure of Alternative computations. Defined by

guard True  = pure ()
guard False = empty

Examples

Expand

Common uses of guard include conditionally signaling an error in an error monad and conditionally rejecting the current choice in an Alternative-based parser.

As an example of signaling an error in the error monad Maybe, consider a safe division function safeDiv x y that returns Nothing when the denominator y is zero and Just (x `div` y) otherwise. For example:

>>> safeDiv 4 0
Nothing
>>> safeDiv 4 2
Just 2

A definition of safeDiv using guards, but not guard:

safeDiv :: Int -> Int -> Maybe Int
safeDiv x y | y /= 0    = Just (x `div` y)
            | otherwise = Nothing

A definition of safeDiv using guard and Monad do-notation:

safeDiv :: Int -> Int -> Maybe Int
safeDiv x y = do
  guard (y /= 0)
  return (x `div` y)

when :: Applicative f => Bool -> f () -> f () #

Conditional execution of Applicative expressions. For example,

when debug (putStrLn "Debugging")

will output the string Debugging if the Boolean value debug is True, and otherwise do nothing.

unless :: Applicative f => Bool -> f () -> f () #

The reverse of when.

void :: Functor f => f a -> f () #

void value discards or ignores the result of evaluation, such as the return value of an IO action.

Using ApplicativeDo: 'void as' can be understood as the do expression

do as
   pure ()

with an inferred Functor constraint.

Examples

Expand

Replace the contents of a Maybe Int with unit:

>>> void Nothing
Nothing
>>> void (Just 3)
Just ()

Replace the contents of an Either Int Int with unit, resulting in an Either Int ():

>>> void (Left 8675309)
Left 8675309
>>> void (Right 8675309)
Right ()

Replace every element of a list with unit:

>>> void [1,2,3]
[(),(),()]

Replace the second element of a pair with unit:

>>> void (1,2)
(1,())

Discard the result of an IO action:

>>> mapM print [1,2]
1
2
[(),()]
>>> void $ mapM print [1,2]
1
2