comparse

comparse is a parser combinator library supporting arbitrary input types.
Combinators do not care about the exact type of the input stream, allowing you
to use them on your own data source, as long as it supports a specific set of
operations. The ParserT monad transformer can wrap around another monad,
allowing you to easily implement features such as tracing, context-sensitive
parsing, state machines, and so on. comparse does its best to provide
meaningful error messages without sacrificing performances or the developer
experience when writing parsers.
How?
The main difference between comparse and other equivalent libraries is that
the parsers you write are generic over the exact type of the input stream. That
means you can write parsers that work with both String, Text or any other
source of Chars with no modification. The combinators in comparse also work
with anything that's an instance of Stream, even if it's not a Stream of
Chars (e.g. a TokenStream!).
Example
The following example shows how to use the comparse library to parse a
simplified subset of the JSON data format:
-- TypeFamilies is required by the `CharParser` constraint
{-# LANGUAGE TypeFamilies #-}
import Control.Monad (void)
import Control.Monad.Parser
import Data.Char (isAlpha, isDigit)
import Data.Text (Text)
import qualified Data.Text as T
data JValue
= JString String
| JNumber Int
| JBool Bool
| JNull
| JObject [(String, JValue)]
| JArray [JValue]
deriving (Show)
main :: IO ()
main = do
input <- getContents
putStrLn "Parsing as String:"
print (parseString input)
putStrLn "Parsing as Text:"
print (parseText $ T.pack input)
parseString :: String -> Maybe JValue
parseString s =
case runStringParser (json <* eof) s of
Parsed v _ _ -> Just v
_ -> Nothing
parseText :: Text -> Maybe JValue
parseText t =
case runTextParser (json <* eof) t of
Parsed v _ _ -> Just v
_ -> Nothing
lexeme :: CharParser p => p a -> p a
lexeme p = spaces *> p <* spaces
where
spaces = void $ many $ oneOf " \n\r\t"
symbol :: CharParser p => String -> p String
symbol = lexeme . string
json :: CharParser p => p JValue
json =
JString <$> stringLiteral
<|> JNumber <$> number
<|> JBool <$> bool
<|> JNull <$ symbol "null"
<|> JObject <$> object
<|> JArray <$> array
stringLiteral :: CharParser p => p String
stringLiteral = lexeme $ like '"' *> many (unlike '\"') <* like '"'
number :: CharParser p => p Int
number =
lexeme
( read <$> ((:) <$> like '-' <*> many1 digit)
<|> read <$> (optional (like '+') *> many1 digit)
)
<* notFollowedBy (match isAlpha)
where
digit = match isDigit
bool :: CharParser p => p Bool
bool = True <$ symbol "true" <|> False <$ symbol "false"
object :: CharParser p => p [(String, JValue)]
object =
symbol "{"
*> sepBy ((,) <$> (stringLiteral <* symbol ":") <*> json) (symbol ",")
<* symbol "}"
array :: CharParser p => p [JValue]
array = symbol "[" *> sepBy json (symbol ",") <* symbol "]"
The CharParser p constraint lets us easily write parsers specialised to
character input streams. It's actually defined as a type alias:
type ParserOf i p = (MonadParser p, Item (Input p) ~ i)
type CharParser p = ParserOf Char p
This design makes it very easy to write reusable parsers and combinators.
For more examples, please refer to the tests.