License | CC0-1.0 |
---|---|
Maintainer | mordae@anilinux.org |
Stability | unstable |
Portability | non-portable (ghc) |
Safe Haskell | None |
Language | Haskell2010 |
This module provides a parser for Unicode Text.
- If you'd like to parse ASCII text, you might want to take a look at Data.ByteString.Parser.Char8. It is much, much faster.
- If you'd like to parse byte sequences, look at Data.ByteString.Parser instead.
Synopsis
- newtype Parser a = Parser {}
- data Result a
- parseOnly :: Parser a -> Text -> Either String a
- char :: Char -> Parser Char
- notChar :: Char -> Parser Char
- anyChar :: Parser Char
- satisfy :: (Char -> Bool) -> Parser Char
- space :: Parser Char
- isSpace :: Char -> Bool
- skipSpace :: Parser ()
- peekChar :: Parser Char
- string :: Text -> Parser Text
- stringCI :: Text -> Parser Text
- take :: Int -> Parser Text
- scan :: s -> (s -> Char -> Maybe s) -> Parser Text
- runScanner :: s -> (s -> Char -> Maybe s) -> Parser (Text, s)
- takeWhile :: (Char -> Bool) -> Parser Text
- takeWhile1 :: (Char -> Bool) -> Parser Text
- takeTill :: (Char -> Bool) -> Parser Text
- takeTill1 :: (Char -> Bool) -> Parser Text
- signed :: Num a => Parser a -> Parser a
- decimal :: Integral a => Parser a
- hexadecimal :: Integral a => Parser a
- octal :: Integral a => Parser a
- fractional :: Fractional a => Parser a
- provided :: (Alternative m, Monad m) => m a -> (a -> Bool) -> m a
- choice :: Alternative f => [f a] -> f a
- branch :: [(Parser a, a -> Parser b)] -> Parser b
- count :: Monad m => Int -> m a -> m [a]
- optional :: Alternative f => f a -> f (Maybe a)
- eitherP :: Alternative f => f a -> f b -> f (Either a b)
- option :: Alternative f => a -> f a -> f a
- many :: Alternative f => f a -> f [a]
- many1 :: Alternative f => f a -> f [a]
- manyTill :: Alternative f => f a -> f a -> f [a]
- sepBy :: Alternative f => f a -> f b -> f [a]
- sepBy1 :: Alternative f => f a -> f b -> f [a]
- wrap :: Applicative f => f a -> f b -> f a
- match :: Parser a -> Parser (Text, a)
- label :: String -> Parser a -> Parser a
- unlabel :: Parser a -> Parser a
- validate :: (a -> Either String b) -> Parser a -> Parser b
- takeText :: Parser Text
- peekText :: Parser Text
- endOfInput :: Parser ()
- atEnd :: Parser Bool
- offset :: Text -> Text -> Int
- position :: Text -> Text -> (Int, Int)
- explain :: String -> Text -> Result a -> Explanation
- data Explanation = Explanation {}
- empty :: Alternative f => f a
- pure :: Applicative f => a -> f a
- guard :: Alternative f => Bool -> f ()
- when :: Applicative f => Bool -> f () -> f ()
- unless :: Applicative f => Bool -> f () -> f ()
- void :: Functor f => f a -> f ()
Documentation
Parser for Text inputs.
Result represents either success or some kind of failure.
You can find the problematic offset by subtracting the length of the remainder from the length of the original input.
Success a !Text | Parser successfully matched the input. Produces the parsing result and the remainder of the input. |
Failure [String] !Text | Parser failed to match the input. Produces list of expected inputs and the corresponding remainder. |
Error String !Text !Int | Parser ran into an error. Either syntactic or a validation one. |
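The offset rule described above can be sketched with plain lists. This is only an illustration: it uses String as a stand-in for Text, and failureOffset is a hypothetical helper name, not something the module exports.

```haskell
-- Hypothetical helper: offset of the remainder within the original input.
-- Assumes the remainder is a suffix of the original.
failureOffset :: String -> String -> Int
failureOffset original remainder = length original - length remainder

main :: IO ()
main = do
  let input = "key = ?value"
      rest  = "?value"   -- what a failing parser might hand back
  print (failureOffset input rest)
```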
parseOnly :: Parser a -> Text -> Either String a
Discards the remaining input and returns just the parse result.
You might want to combine it with endOfInput for the best effect.
Example:
parseOnly (pContacts <* endOfInput) bstr
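To see why pairing parseOnly with endOfInput matters, here is a toy model. The P type and the names digits, end, before and parseOnlyToy are all hypothetical and work on String; they only imitate the behaviour described above and are not the module's implementation.

```haskell
import Data.Char (isDigit)

-- A toy parser over String, for illustration only.
newtype P a = P { runP :: String -> Maybe (a, String) }

-- Toy counterpart of takeWhile1 isDigit.
digits :: P String
digits = P $ \s -> case span isDigit s of
  ("", _)    -> Nothing
  (ds, rest) -> Just (ds, rest)

-- Toy counterpart of endOfInput.
end :: P ()
end = P $ \s -> if null s then Just ((), s) else Nothing

-- Run the first parser, then the second, keeping the first result (like <*).
before :: P a -> P b -> P a
before (P p) (P q) = P $ \s -> do
  (a, rest)  <- p s
  (_, rest') <- q rest
  Just (a, rest')

-- Toy counterpart of parseOnly: drop the remainder.
parseOnlyToy :: P a -> String -> Maybe a
parseOnlyToy p = fmap fst . runP p

main :: IO ()
main = do
  -- The trailing "abc" is silently dropped.
  print (parseOnlyToy digits "123abc")
  -- Rejected, because the input is not exhausted.
  print (parseOnlyToy (digits `before` end) "123abc")
```

Without the end-of-input check, trailing garbage goes unnoticed because the remainder is discarded.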
Chars
Accepts a single Unicode white space character.
See isSpace for details.
Returns True for any Unicode space character, and the control characters \t, \n, \r, \f, \v.
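A quick check of that description, assuming the isSpace listed here behaves like Data.Char.isSpace from base (the wording matches, but that is an assumption):

```haskell
import Data.Char (isSpace)

main :: IO ()
main = do
  -- The five control characters listed above, plus an ordinary space.
  print (all isSpace "\t\n\r\f\v ")
  -- A non-ASCII space (U+2003 EM SPACE) and an ordinary letter.
  print (isSpace '\x2003', isSpace 'a')
```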
skipSpace :: Parser ()
Accepts multiple Unicode white space characters.
See isSpace for details.
peekChar :: Parser Char
Peeks ahead, but does not consume.
Be careful, peeking past the end of the input fails.
You might want to check using atEnd beforehand.
Strings
take :: Int -> Parser Text
Accepts the given number of characters. Fails when not enough characters are available.
scan :: s -> (s -> Char -> Maybe s) -> Parser Text
Scans ahead statefully and then accepts whatever characters the scanner liked.
The scanner returns Nothing to mark the end of the acceptable extent.
runScanner :: s -> (s -> Char -> Maybe s) -> Parser (Text, s)
Like scan, but also returns the final scanner state.
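The scanner contract can be modelled on plain Strings. In this sketch, runScannerToy and stringBody are hypothetical names, and the choice to report the last state seen before the scanner returned Nothing is an assumption about what "final scanner state" means.

```haskell
-- Toy model of runScanner over String (the module works on Text).
-- Returns the accepted prefix, the final state, and the unconsumed rest.
runScannerToy :: s -> (s -> Char -> Maybe s) -> String -> (String, s, String)
runScannerToy s0 step = go s0 []
  where
    go s acc (c:cs) | Just s' <- step s c = go s' (c : acc) cs
    go s acc rest = (reverse acc, s, rest)

-- Scanner state: was the previous character a backslash?
-- Accepts the body of a quoted string, stopping at an unescaped quote.
stringBody :: Bool -> Char -> Maybe Bool
stringBody True  _    = Just False   -- escaped character, accept anything
stringBody False '"'  = Nothing      -- unescaped quote ends the extent
stringBody False '\\' = Just True    -- start of an escape
stringBody False _    = Just False

main :: IO ()
main = print (runScannerToy False stringBody "a\\\"b\" tail")
```

A stateless predicate, as used by takeWhile, could not tell the escaped quote from the closing one; threading a state through the scan is what makes this possible.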
takeWhile :: (Char -> Bool) -> Parser Text
Efficiently consume as long as the input characters match the predicate.
An inverse of takeTill.
takeWhile1 :: (Char -> Bool) -> Parser Text
Like takeWhile, but requires at least a single character.
takeTill :: (Char -> Bool) -> Parser Text
Efficiently consume until a character matching the predicate is found.
An inverse of takeWhile.
takeTill1 :: (Char -> Bool) -> Parser Text
Same as takeTill, but requires at least a single character.
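The "inverse" relationship between takeWhile and takeTill mirrors span and break on lists. A small String-based sketch, with hypothetical names takeWhileToy and takeTillToy that return the consumed text together with the remainder:

```haskell
import Data.Char (isDigit)

-- Toy String versions; each returns (consumed, remainder).
takeWhileToy, takeTillToy :: (Char -> Bool) -> String -> (String, String)
takeWhileToy = span
takeTillToy p = span (not . p)   -- the same split that break p would give

main :: IO ()
main = do
  print (takeWhileToy isDigit "2024-01")
  print (takeTillToy (== '-') "2024-01")
```

Both calls split the input at the same place, since "while it is a digit" and "until it is a dash" describe the same extent here.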
Numbers
signed :: Num a => Parser a -> Parser a
Accepts an optional '+' or '-' character and then applies the sign to the result of the following parser.
hexadecimal :: Integral a => Parser a
Accepts an integral number in the hexadecimal format in either case.
Does not look for 0x or similar prefixes.
fractional :: Fractional a => Parser a
Accepts a fractional number as a decimal optionally followed by a decimal point and the fractional part. Does not support exponentiation.
Combinators
provided :: (Alternative m, Monad m) => m a -> (a -> Bool) -> m a
Fails if the value returned by the parser does not conform to the predicate. Generalized form of string.
Example:
pInput = takeWhile isLetter `provided` (odd . length)
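Because the signature only asks for Alternative and Monad, the idea can be tried out in Maybe or in the list monad. The definition below is one plausible way to write such a combinator; providedSketch is a hypothetical name and this is not taken from the library's source.

```haskell
import Control.Applicative (Alternative, empty)

-- Run the action, then keep its result only if the predicate holds.
providedSketch :: (Alternative m, Monad m) => m a -> (a -> Bool) -> m a
providedSketch m p = do
  x <- m
  if p x then pure x else empty

main :: IO ()
main = do
  print (Just 3 `providedSketch` odd)
  print (Just 4 `providedSketch` odd)
  print ([1 .. 6] `providedSketch` even)
```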
choice :: Alternative f => [f a] -> f a
Tries various parsers, one by one.
Example:
    pExpression = choice
      [ pConstant
      , pVariable
      , pBinaryOperation
      , pFunctionApplication
      ]
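A combinator with this signature is commonly written as a right fold over <|>. The sketch below assumes that shape (choiceSketch is a hypothetical name, not the library's definition) and exercises it with Maybe, where the first Just wins.

```haskell
import Control.Applicative (Alternative, empty, (<|>))

-- Try each alternative in order; an empty list always fails.
choiceSketch :: Alternative f => [f a] -> f a
choiceSketch = foldr (<|>) empty

main :: IO ()
main = do
  print (choiceSketch [Nothing, Just 'b', Just 'c'])
  print (choiceSketch ([] :: [Maybe Char]))
```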
branch :: [(Parser a, a -> Parser b)] -> Parser b
Given a list of matchers and parsers, runs the first parser whose matcher succeeds on the input. This pattern makes for a simpler alternative to the try combinator used in other parser combinator libraries.
Example:
    pProperty = branch
      [ ( string "public" <* skipSpace
        , \_ -> Property Public <$> pToken
        )
      , ( string "private" <* skipSpace
        , \_ -> Property Private <$> pToken
        )
      ]
count :: Monad m => Int -> m a -> m [a]
Replicates the parser the given number of times, collecting the results in a list. Fails if any instance of the parser fails.
Example:
    pFourWords = (:) <$> word <*> count 3 (blank *> word)
      where
        word = takeWhile1 isLetter
        blank = takeWhile1 isSpace
optional :: Alternative f => f a -> f (Maybe a)
One or none.
eitherP :: Alternative f => f a -> f b -> f (Either a b)
Captures the first parser as Left or the second as Right.
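One plausible definition matching this signature, tried out in Maybe. The name eitherSketch is hypothetical and the body is an assumption about how such a combinator is usually written, not the library's source.

```haskell
import Control.Applicative (Alternative, (<|>))

-- Tag the first alternative with Left, and fall back to the second tagged with Right.
eitherSketch :: Alternative f => f a -> f b -> f (Either a b)
eitherSketch a b = (Left <$> a) <|> (Right <$> b)

main :: IO ()
main = do
  print (eitherSketch (Just 1 :: Maybe Int) (Just 'x'))
  print (eitherSketch (Nothing :: Maybe Int) (Just 'x'))
```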
option :: Alternative f => a -> f a -> f a
Shortcut for optional with a default value.
Example:
    data Contact = Contact
      { contactName :: Text
      , contactEmail :: Maybe Text
      }

    pContact = Contact <$> pFullName <*> option Nothing (Just <$> pEmail)
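The relationship between option and optional can be seen in Maybe. Here optionSketch is a hypothetical definition that fits the signature above (try the action, otherwise yield the default); it is a sketch, not the library's code.

```haskell
import Control.Applicative (Alternative, optional, (<|>))

-- Try the action; if it fails, succeed with the default instead.
optionSketch :: Alternative f => a -> f a -> f a
optionSketch def p = p <|> pure def

main :: IO ()
main = do
  print (optionSketch 0 (Just 5 :: Maybe Int))
  print (optionSketch 0 (Nothing :: Maybe Int))
  -- optional never fails; it reports absence as an inner Nothing.
  print (optional (Nothing :: Maybe Int))
```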
many :: Alternative f => f a -> f [a]
Zero or more.
many1 :: Alternative f => f a -> f [a]
Like many, but requires at least one match.
manyTill :: Alternative f => f a -> f a -> f [a]
Like many, but stops once the second parser matches the input ahead.
Example:
    pBodyLines = pLine `manyTill` pEnd
      where
        pLine = takeTill (== '\n')
        pEnd = string "\n.\n"
sepBy :: Alternative f => f a -> f b -> f [a]
sepBy1 :: Alternative f => f a -> f b -> f [a]
Like sepBy, but requires at least one match.
wrap :: Applicative f => f a -> f b -> f a
Wraps the parser from both sides.
Example:
pToken = takeWhile1 (inClass "A-Za-z0-9_") `wrap` takeWhile isSpace
match :: Parser a -> Parser (Text, a)
Makes the parser not only return the result, but also the original matched extent.
label :: String -> Parser a -> Parser a
Names an extent of the parser.
When the extent returns a Failure, details are discarded and replaced with the extent as a whole.
When the extent returns an Error, it is adjusted to cover the whole extent, but the reason is left intact.
You should strive to make labeled extents as small as possible, approximately of a typical token size. For example:
pString = label "string" $ pStringContents `wrap` char '"'
unlabel :: Parser a -> Parser a
Un-names an extent of the parser.
Same as label, but removes any expected values upon Failure.
Very useful for marking comments and optional whitespace.
validate :: (a -> Either String b) -> Parser a -> Parser b
Validates the parser result and turns it into an Error upon failure.
End Of Input
endOfInput :: Parser ()
Accepts end of input and fails if we are not there yet.
Position
position :: Text -> Text -> (Int, Int)
Determines (line, column) from the original input and the remainder.
Counts line feed characters leading up to the offset, so only use it on your slow path, for example when describing parsing errors.
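The line-feed counting described above can be sketched on Strings. Everything here is an assumption made for illustration: positionToy is a hypothetical name, the remainder is taken to be a suffix of the original, and lines and columns are numbered from 1, which the text above does not specify.

```haskell
-- Toy String version of a (line, column) lookup.
-- Assumes 1-based numbering and that the remainder is a suffix of the original.
positionToy :: String -> String -> (Int, Int)
positionToy original remainder = (line, column)
  where
    -- Everything the parser consumed before stopping.
    consumed = take (length original - length remainder) original
    -- One line per line feed seen so far.
    line     = 1 + length (filter (== '\n') consumed)
    -- Characters since the last line feed.
    column   = 1 + length (takeWhile (/= '\n') (reverse consumed))

main :: IO ()
main = print (positionToy "ab\ncd\nef" "f")
```

The whole consumed prefix is walked on every call, which is why this kind of lookup belongs on the error-reporting path only.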
explain :: String -> Text -> Result a -> Explanation
Processes the result for showing it to the user.
data Explanation
Instances
Eq Explanation | Defined in Data.Text.Parser |
    (==) :: Explanation -> Explanation -> Bool
    (/=) :: Explanation -> Explanation -> Bool
Show Explanation | Defined in Data.Text.Parser |
    showsPrec :: Int -> Explanation -> ShowS
    show :: Explanation -> String
    showList :: [Explanation] -> ShowS
Miscellaneous
These are all generic methods, but since I sometimes forget about them, it is nice to have them listed here for reference when writing parsers.
empty :: Alternative f => f a
The identity of <|>
pure :: Applicative f => a -> f a
Lift a value.
guard :: Alternative f => Bool -> f ()
Conditional failure of Alternative computations. Defined by
    guard True = pure ()
    guard False = empty
Examples
Common uses of guard include conditionally signaling an error in an error monad and conditionally rejecting the current choice in an Alternative-based parser.
As an example of signaling an error in the error monad Maybe, consider a safe division function safeDiv x y that returns Nothing when the denominator y is zero and Just (x `div` y) otherwise. For example:
    >>> safeDiv 4 0
    Nothing
    >>> safeDiv 4 2
    Just 2
A definition of safeDiv using guards, but not guard:
    safeDiv :: Int -> Int -> Maybe Int
    safeDiv x y | y /= 0 = Just (x `div` y)
                | otherwise = Nothing
A definition of safeDiv using guard and Monad do-notation:
    safeDiv :: Int -> Int -> Maybe Int
    safeDiv x y = do
      guard (y /= 0)
      return (x `div` y)
when :: Applicative f => Bool -> f () -> f ()
Conditional execution of Applicative expressions. For example,
    when debug (putStrLn "Debugging")
will output the string Debugging if the Boolean value debug is True, and otherwise do nothing.
unless :: Applicative f => Bool -> f () -> f ()
The reverse of when.
void :: Functor f => f a -> f ()
void value discards or ignores the result of evaluation, such as the return value of an IO action.
Using ApplicativeDo: 'void as' can be understood as the do expression
    do as
       pure ()
with an inferred Functor constraint.
Examples
Replace the contents of a Maybe Int with unit:
    >>> void Nothing
    Nothing
    >>> void (Just 3)
    Just ()
Replace the contents of an Either Int Int with unit, resulting in an Either Int ():
    >>> void (Left 8675309)
    Left 8675309
    >>> void (Right 8675309)
    Right ()
Replace every element of a list with unit:
    >>> void [1,2,3]
    [(),(),()]
Replace the second element of a pair with unit:
    >>> void (1,2)
    (1,())
Discard the result of an IO action:
    >>> mapM print [1,2]
    1
    2
    [(),()]
    >>> void $ mapM print [1,2]
    1
    2