Safe Haskell	None

Data.Attoparsec.Text.Parsec

Contents

Writing parsers that behave consistent across Attoparsec and Parsec
Parser type
Running parsers
Combinators
Parsing individual characters
- Special character parsers
- Character classes
Efficient string handling
- Consume all remaining input
Text parsing
Numeric parsers
State observation and manipulation functions

Description

This module implements Data.Attoparsec.Text in terms of Parsec. It can be used to write parsers that can be compiled against both Attoparsec and Parsec.

Differences from Data.Attoparsec.Text:

Incremental input is not supported.
satisfyWith, skip, scan, and most of the numeric parsers are not yet implemented. Patches are gladly welcome!
Parsec parsers (and hence the parsers provided here) do not automatically backtrack on failing alternatives that consumed input. With careful use of try it is possible to write parsers that behave consistent across Attoparsec and Parsec. Read the next section for more on that.

A simple usage example is here: https://github.com/sol/attoparsec-parsec#readme

Synopsis

Writing parsers that behave consistent across Attoparsec and Parsec

Some care is needed, so that parsers behave consistent across Attoparsec and Parsec in regards to backtracking. Attoparsec parsers always backtrack on failure. In contrast, a Parsec parser that fails after it has consumed input will not automatically backtrack, but it can be turned into backtracking parsers with try.

Here is an example that illustrates the difference. The following parser will fail under Parsec given an input of "for":

string "foo" <|> string "for"

The reason for its failure is that the first branch is a partial match, and will consume the letters 'f' and 'o' before failing. In Attoparsec, the above parser will succeed on that input, because the failed first branch will consume nothing.

The try function can be used to write parsers that behave consistent across Attoparsec and Parsec. Each alternative that may fail after consuming input, has to be prefixed with try. E.g. for the parser above we would write:

try (string "foo") <|> string "for"

For Parsec try enables backtracking, for Attoparsec it's just a type-constrained version of id (see Attoparsec's try).

Parser type

type Parser = Parsec Text ()

Running parsers

parseOnly :: Parser a -> Text -> Either String aSource

Combinators

(<?>)Source

Arguments

:: Parser a
-> String	the name to use if parsing fails
-> Parser a

Name the parser, in case failure occurs.

See Parsec's documentation of <?> for detailed semantics.

try :: Parser a -> Parser aSource

Attempt a parse, and if it fails, rewind the input so that no input appears to have been consumed.

See Parsec's documentation of try for detailed semantics.

module Data.Attoparsec.Combinator

Parsing individual characters

char :: Char -> Parser Char Source

Match a specific character.

anyChar :: Parser Char Source

Match any character.

notChar :: Char -> Parser Char Source

Match any character except the given one.

satisfy :: (Char -> Bool) -> Parser Char Source

The parser satisfy p succeeds for any character for which the predicate p returns True. Returns the character that is actually parsed.

digit = satisfy isDigit
    where isDigit c = c >= '0' && c <= '9'

peekChar :: Parser (Maybe Char)Source

Special character parsers

digit :: Parser Char Source

Parse a single digit, as recognised by isDigit.

letter :: Parser Char Source

Parse a letter, as recognised by isAlpha.

space :: Parser Char Source

Parse a space character, as recognised by isSpace.

Character classes

inClass :: String -> Char -> Bool

Match any character in a set.

vowel = inClass "aeiou"

Range notation is supported.

halfAlphabet = inClass "a-nA-N"

To add a literal '-' to a set, place it at the beginning or end of the string.

notInClass :: String -> Char -> Bool

Match any character not in a set.

Efficient string handling

string :: Text -> Parser Text Source

string s parses a sequence of characters that identically match s. Returns the parsed string (i.e. s).

stringCI :: Text -> Parser Text Source

Satisfy a literal string, ignoring case.

Note: No proper case folding is done, yet. Currently stringCI s is just

 char (toLower c) <|> char (toUpper c)

for each character of s. The implementation from Data.Attoparsec.Text tries to do proper case folding, but is actually buggy (see https://github.com/bos/attoparsec/issues/6). As long as you deal with characters from the ASCII range, both implementations should be fine.

skipSpace :: Parser ()Source

Skip over white space.

skipWhile :: (Char -> Bool) -> Parser ()Source

Skip past input for as long as the predicate returns True.

take :: Int -> Parser Text Source

Consume exactly n characters of input.

takeWhile :: (Char -> Bool) -> Parser Text Source

Consume input as long as the predicate returns True, and return the consumed input.

This parser does not fail. It will return an empty string if the predicate returns False on the first character of input.

Note: Because this parser does not fail, do not use it with combinators such as many, because such parsers loop until a failure occurs. Careless use will thus result in an infinite loop.

takeWhile1 :: (Char -> Bool) -> Parser Text Source

Consume input as long as the predicate returns True, and return the consumed input.

This parser requires the predicate to succeed on at least one character of input: it will fail if the predicate never returns True or if there is no input left.

takeTill :: (Char -> Bool) -> Parser Text Source

Consume input as long as the predicate returns False (i.e. until it returns True), and return the consumed input.

This parser does not fail. It will return an empty string if the predicate returns True on the first character of input.

Note: Because this parser does not fail, do not use it with combinators such as many, because such parsers loop until a failure occurs. Careless use will thus result in an infinite loop.