megaparsec-5.0.0: Monadic parser combinators

Copyright© 2015–2016 Megaparsec contributors © 2007 Paolo Martini © 1999–2001 Daan Leijen
LicenseFreeBSD
MaintainerMark Karpov <markkarpov@opmbx.org>
Stabilityexperimental
Portabilitynon-portable
Safe HaskellNone
LanguageHaskell2010

Text.Megaparsec.Prim

Contents

Description

The primitive parser combinators.

Synopsis

Data types

data State s Source

This is Megaparsec's state, it's parametrized over stream type s.

Constructors

State 

Instances

Eq s => Eq (State s) Source 
Show s => Show (State s) Source 

class Ord (Token s) => Stream s where Source

An instance of Stream s has stream type s. Token type is determined by the stream and can be found via Token type function.

Associated Types

type Token s :: * Source

Type of token in stream.

Since: 5.0.0

Methods

uncons :: s -> Maybe (Token s, s) Source

Get next token from the stream. If the stream is empty, return Nothing.

updatePos Source

Arguments

:: Proxy s

Proxy clarifying stream type (Token is not injective)

-> Pos

Tab width

-> SourcePos

Current position

-> Token s

Current token

-> (SourcePos, SourcePos)

Actual position and incremented position

Update position in stream given tab width, current position, and current token. The result is a tuple where the first element will be used to report parse errors for current token, while the second element is the incremented position that will be stored in parser's state.

When you work with streams where elements do not contain information about their position in input, result is usually consists of the third argument unchanged and incremented position calculated with respect to current token. This is how default instances of Stream work (they use defaultUpdatePos, which may be a good starting point for your own position-advancing function).

When you wish to deal with stream of tokens where every token “knows” its start and end position in input (for example, you have produced the stream with Happy/Alex), then the best strategy is to use the start position as actual element position and provide the end position of the token as incremented one.

Since: 5.0.0

type Parsec e s = ParsecT e s Identity Source

Parsec is non-transformer variant of more general ParsecT monad transformer.

data ParsecT e s m a Source

ParsecT e s m a is a parser with custom data component of error e, stream type s, underlying monad m and return type a.

Primitive combinators

class (ErrorComponent e, Stream s, Alternative m, MonadPlus m) => MonadParsec e s m | m -> e s where Source

Type class describing parsers independent of input type.

Methods

failure Source

Arguments

:: Set (ErrorItem (Token s))

Unexpected items

-> Set (ErrorItem (Token s))

Expected items

-> Set e

Custom data

-> m a 

The most general way to stop parsing and report ParseError.

unexpected is defined in terms of this function:

unexpected item = failure (Set.singleton item) Set.empty Set.empty

Since: 4.2.0

label :: String -> m a -> m a Source

The parser label name p behaves as parser p, but whenever the parser p fails without consuming any input, it replaces names of “expected” tokens with the name name.

hidden :: m a -> m a Source

hidden p behaves just like parser p, but it doesn't show any “expected” tokens in error message when p fails.

try :: m a -> m a Source

The parser try p behaves like parser p, except that it pretends that it hasn't consumed any input when an error occurs.

This combinator is used whenever arbitrary look ahead is needed. Since it pretends that it hasn't consumed any input when p fails, the (<|>) combinator will try its second alternative even when the first parser failed while consuming input.

For example, here is a parser that is supposed to parse word “let” or “lexical”:

>>> parseTest (string "let" <|> string "lexical") "lexical"
1:1:
unexpected "lex"
expecting "let"

What happens here? First parser consumes “le” and fails (because it doesn't see a “t”). The second parser, however, isn't tried, since the first parser has already consumed some input! try fixes this behavior and allows backtracking to work:

>>> parseTest (try (string "let") <|> string "lexical") "lexical"
"lexical"

try also improves error messages in case of overlapping alternatives, because Megaparsec's hint system can be used:

>>> parseTest (try (string "let") <|> string "lexical") "le"
1:1:
unexpected "le"
expecting "let" or "lexical"

Please note that as of Megaparsec 4.4.0, string backtracks automatically (see tokens), so it does not need try. However, the examples above demonstrate the idea behind try so well that it was decided to keep them.

lookAhead :: m a -> m a Source

lookAhead p parses p without consuming any input.

If p fails and consumes some input, so does lookAhead. Combine with try if this is undesirable.

notFollowedBy :: m a -> m () Source

notFollowedBy p only succeeds when parser p fails. This parser does not consume any input and can be used to implement the “longest match” rule.

withRecovery Source

Arguments

:: (ParseError (Token s) e -> m a)

How to recover from failure

-> m a

Original parser

-> m a

Parser that can recover from failures

withRecovery r p allows continue parsing even if parser p fails. In this case r is called with actual ParseError as its argument. Typical usage is to return value signifying failure to parse this particular object and to consume some part of input up to start of next object.

Note that if r fails, original error message is reported as if without withRecovery. In no way recovering parser r can influence error messages.

Since: 4.4.0

eof :: m () Source

This parser only succeeds at the end of the input.

token Source

Arguments

:: (Token s -> Either (Set (ErrorItem (Token s)), Set (ErrorItem (Token s)), Set e) a)

Matching function for the token to parse, it allows to construct arbitrary error message on failure as well; sets in three-tuple are: unexpected items, expected items, and custom data pieces

-> Maybe (Token s)

Token to report when input stream is empty

-> m a 

The parser token test mrep accepts a token t with result x when the function test t returns Right x. mrep may provide representation of the token to report in error messages when input stream in empty.

This is the most primitive combinator for accepting tokens. For example, the satisfy parser is implemented as:

satisfy f = token testChar Nothing
  where
    testChar x =
      if f x
        then Right x
        else Left (Set.singleton (Tokens (x:|[])), Set.empty, Set.empty)

tokens Source

Arguments

:: (Token s -> Token s -> Bool)

Predicate to check equality of tokens

-> [Token s]

List of tokens to parse

-> m [Token s] 

The parser tokens test parses list of tokens and returns it. Supplied predicate test is used to check equality of given and parsed tokens.

This can be used for example to write string:

string = tokens (==)

Note that beginning from Megaparsec 4.4.0, this is an auto-backtracking primitive, which means that if it fails, it never consumes any input. This is done to make its consumption model match how error messages for this primitive are reported (which becomes an important thing as user gets more control with primitives like withRecovery):

>>> parseTest (string "abc") "abd"
1:1:
unexpected "abd"
expecting "abc"

This means, in particular, that it's no longer necessary to use try with tokens-based parsers, such as string and string'. This feature does not affect performance in any way.

getParserState :: m (State s) Source

Returns the full parser state as a State record.

updateParserState :: (State s -> State s) -> m () Source

updateParserState f applies function f to the parser state.

Instances

MonadParsec e s m => MonadParsec e s (IdentityT m) Source 
(Monoid w, MonadParsec e s m) => MonadParsec e s (WriterT w m) Source 
(Monoid w, MonadParsec e s m) => MonadParsec e s (WriterT w m) Source 
MonadParsec e s m => MonadParsec e s (ReaderT st m) Source 
MonadParsec e s m => MonadParsec e s (StateT st m) Source 
MonadParsec e s m => MonadParsec e s (StateT st m) Source 
(ErrorComponent e, Stream s) => MonadParsec e s (ParsecT e s m) Source 

(<?>) :: MonadParsec e s m => m a -> String -> m a infix 0 Source

A synonym for label in form of an operator.

unexpected :: MonadParsec e s m => ErrorItem (Token s) -> m a Source

The parser unexpected item always fails with an error message telling about unexpected item item without consuming any input.

Parser state combinators

getInput :: MonadParsec e s m => m s Source

Return the current input.

setInput :: MonadParsec e s m => s -> m () Source

setInput input continues parsing with input. The getInput and setInput functions can for example be used to deal with include files.

getPosition :: MonadParsec e s m => m SourcePos Source

Return the current source position.

See also: setPosition, pushPosition, popPosition, and SourcePos.

setPosition :: MonadParsec e s m => SourcePos -> m () Source

setPosition pos sets the current source position to pos.

See also: getPosition, pushPosition, popPosition, and SourcePos.

pushPosition :: MonadParsec e s m => SourcePos -> m () Source

Push given position into stack of positions and continue parsing working with this position. Useful for working with include files and the like.

See also: getPosition, setPosition, popPosition, and SourcePos.

Since: 5.0.0

popPosition :: MonadParsec e s m => m () Source

Pop a position from stack of positions unless it only contains one element (in that case stack of positions remains the same). This is how to return to previous source file after pushPosition.

See also: getPosition, setPosition, pushPosition, and SourcePos.

Since: 5.0.0

getTabWidth :: MonadParsec e s m => m Pos Source

Return tab width. Default tab width is equal to defaultTabWidth. You can set different tab width with help of setTabWidth.

setTabWidth :: MonadParsec e s m => Pos -> m () Source

Set tab width. If argument of the function is not positive number, defaultTabWidth will be used.

setParserState :: MonadParsec e s m => State s -> m () Source

setParserState st set the full parser state to st.

Running parser

runParser Source

Arguments

:: Parsec e s a

Parser to run

-> String

Name of source file

-> s

Input for parser

-> Either (ParseError (Token s) e) a 

runParser p file input runs parser p on the input list of tokens input, obtained from source file. The file is only used in error messages and may be the empty string. Returns either a ParseError (Left) or a value of type a (Right).

parseFromFile p file = runParser p file <$> readFile file

runParser' Source

Arguments

:: Parsec e s a

Parser to run

-> State s

Initial state

-> (State s, Either (ParseError (Token s) e) a) 

The function is similar to runParser with the difference that it accepts and returns parser state. This allows to specify arbitrary textual position at the beginning of parsing, for example. This is the most general way to run a parser over the Identity monad.

Since: 4.2.0

runParserT Source

Arguments

:: Monad m 
=> ParsecT e s m a

Parser to run

-> String

Name of source file

-> s

Input for parser

-> m (Either (ParseError (Token s) e) a) 

runParserT p file input runs parser p on the input list of tokens input, obtained from source file. The file is only used in error messages and may be the empty string. Returns a computation in the underlying monad m that returns either a ParseError (Left) or a value of type a (Right).

runParserT' Source

Arguments

:: Monad m 
=> ParsecT e s m a

Parser to run

-> State s

Initial state

-> m (State s, Either (ParseError (Token s) e) a) 

This function is similar to runParserT, but like runParser' it accepts and returns parser state. This is thus the most general way to run a parser.

Since: 4.2.0

parse Source

Arguments

:: Parsec e s a

Parser to run

-> String

Name of source file

-> s

Input for parser

-> Either (ParseError (Token s) e) a 

parse p file input runs parser p over Identity (see runParserT if you're using the ParsecT monad transformer; parse itself is just a synonym for runParser). It returns either a ParseError (Left) or a value of type a (Right). parseErrorPretty can be used to turn ParseError into the string representation of the error message. See Text.Megaparsec.Error if you need to do more advanced error analysis.

main = case (parse numbers "" "11,2,43") of
         Left err -> putStr (parseErrorPretty err)
         Right xs -> print (sum xs)

numbers = integer `sepBy` char ','

parseMaybe :: (ErrorComponent e, Stream s) => Parsec e s a -> s -> Maybe a Source

parseMaybe p input runs parser p on input and returns result inside Just on success and Nothing on failure. This function also parses eof, so if the parser doesn't consume all of its input, it will fail.

The function is supposed to be useful for lightweight parsing, where error messages (and thus file name) are not important and entire input should be parsed. For example it can be used when parsing of single number according to specification of its format is desired.

parseTest Source

Arguments

:: (ShowErrorComponent e, Ord (Token s), ShowToken (Token s), Show a) 
=> Parsec e s a

Parser to run

-> s

Input for parser

-> IO () 

The expression parseTest p input applies a parser p against input input and prints the result to stdout. Useful for testing.