parsec2-1.0.1: Monadic parser combinators

Copyright: (c) Daan Leijen 1999-2001
License: BSD-style (see the file libraries/parsec/LICENSE)
Maintainer: Antoine Latter <aslatter@gmail.com>
Stability: provisional
Portability: portable
Safe Haskell: Safe
Language: Haskell98

Text.ParserCombinators.Parsec.Prim

Description

The primitive parser combinators.

Documentation

(<?>) :: GenParser tok st a -> String -> GenParser tok st a infix 0

The parser p <?> msg behaves as parser p, but whenever the parser p fails without consuming any input, it replaces expect error messages with the expect error message msg.

This is normally used at the end of a set of alternatives where we want to return an error message in terms of a higher level construct rather than returning all possible characters. For example, if the expr parser from the try example fails, the error message is: '...: expecting expression'. Without the (<?>) combinator, the message would be like '...: expecting "let" or letter', which is less friendly.
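As a rough illustration of the difference a label makes, the sketch below builds the same choice with and without (<?>). The names identifier, plainExpr, expr and errorText are hypothetical helpers introduced only for this example.

```haskell
import Text.ParserCombinators.Parsec

-- Hypothetical identifier parser: one or more letters.
identifier :: Parser String
identifier = many1 letter

-- Without a label, a failure lists the low-level expectations.
plainExpr :: Parser String
plainExpr = string "let" <|> identifier

-- With (<?>), a failure that consumed no input is reported in terms
-- of the label "expression" instead.
expr :: Parser String
expr = string "let" <|> identifier <?> "expression"

-- Render the parse error (if any) for a given parser and input.
errorText :: Parser String -> String -> String
errorText p input = either show (const "no error") (parse p "" input)
```

Comparing errorText plainExpr "1" with errorText expr "1" in GHCi shows the two messages side by side. Since (<?>) has lower precedence than (<|>), the label applies to the whole choice.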

(<|>) :: GenParser tok st a -> GenParser tok st a -> GenParser tok st a infixr 1

This combinator implements choice. The parser p <|> q first applies p. If it succeeds, the value of p is returned. If p fails without consuming any input, parser q is tried. This combinator is defined equal to the mplus member of the MonadPlus class and the (<|>) member of Alternative.

The parser is called predictive since q is only tried when parser p didn't consume any input (i.e. the look ahead is 1). This non-backtracking behaviour allows for both an efficient implementation of the parser combinators and the generation of good error messages.
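A small sketch of predictive choice, assuming two alternatives that can be told apart by their first character. The names number, word, atom, atomM and run are hypothetical; atomM only illustrates the stated equivalence with mplus.

```haskell
import Control.Monad (mplus)
import Text.ParserCombinators.Parsec

-- Two alternatives that start with different kinds of character, so one
-- token of look-ahead is enough to pick between them.
number :: Parser String
number = many1 digit

word :: Parser String
word = many1 letter

-- If number fails without consuming input, word is tried.
atom :: Parser String
atom = number <|> word

-- The same choice written with mplus from MonadPlus.
atomM :: Parser String
atomM = number `mplus` word

-- Run a parser and discard the error details.
run :: Parser String -> String -> Maybe String
run p input = either (const Nothing) Just (parse p "" input)
```

Alternatives that share a prefix are a different matter: once the first one has consumed input, the second is no longer tried unless try is used.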

type Parser a = GenParser Char () a

data GenParser tok st a

runParser :: GenParser tok st a -> st -> SourceName -> [tok] -> Either ParseError a

The most general way to run a parser. runParser p state filePath input runs parser p on the input list of tokens input, obtained from source filePath, with the initial user state state. The filePath is only used in error messages and may be the empty string. Returns either a ParseError (Left) or a value of type a (Right).

 parseFromFile p fname
   = do{ input <- readFile fname
       ; return (runParser p () fname input)
       }
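The example above passes () as the user state. As a sketch of a run with a non-trivial state, the following counts letters in an Int; countedLetter, letterCount and countLetters are hypothetical names chosen for this illustration.

```haskell
import Text.ParserCombinators.Parsec

-- Accept one letter and add one to the Int user state.
countedLetter :: GenParser Char Int Char
countedLetter = do
  c <- letter
  updateState (+ 1)
  return c

-- Parse a run of letters and return the final counter.
letterCount :: GenParser Char Int Int
letterCount = do
  _ <- many countedLetter
  getState

-- Run with the initial state 0; the source name "" only appears in
-- error messages.
countLetters :: String -> Either ParseError Int
countLetters input = runParser letterCount 0 "" input
```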

parse :: GenParser tok () a -> SourceName -> [tok] -> Either ParseError a

parse p filePath input runs a parser p without user state. The filePath is only used in error messages and may be the empty string. Returns either a ParseError (Left) or a value of type a (Right).

 main    = case (parse numbers "" "11, 2, 43") of
            Left err  -> print err
            Right xs  -> print (sum xs)

 numbers = commaSep integer

parseTest :: Show a => GenParser tok () a -> [tok] -> IO ()

The expression parseTest p input applies a parser p against input input and prints the result to stdout. Used for testing parsers.

token :: (tok -> String) -> (tok -> SourcePos) -> (tok -> Maybe a) -> GenParser tok st a

The parser token showTok posFromTok testTok accepts a token t with result x when the function testTok t returns Just x. The source position of t should be returned by posFromTok t and the token can be shown using showTok t.

This combinator is expressed in terms of tokenPrim. It is used to accept user defined token streams. For example, suppose that we have a stream of basic tokens tupled with source positions. We can then define a parser that accepts single tokens as:

 mytoken x
   = token showTok posFromTok testTok
   where
     showTok (pos,t)     = show t
     posFromTok (pos,t)  = pos
     testTok (pos,t)     = if x == t then Just t else Nothing
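To make the mytoken example concrete, here is a sketch over a small hand-built token stream. The Tok type, the PosTok alias, number, sumExpr and sampleTokens are all assumptions made for this illustration; in practice the stream would come from a separate lexer.

```haskell
import Text.ParserCombinators.Parsec
import Text.ParserCombinators.Parsec.Pos (newPos)

-- Hypothetical token type produced by some earlier lexing stage.
data Tok = TNum Int | TPlus
  deriving (Eq, Show)

-- A token paired with the position where it was found.
type PosTok = (SourcePos, Tok)

-- Accept exactly the given token, as in the mytoken example above.
mytoken :: Tok -> GenParser PosTok () Tok
mytoken x = token showTok posFromTok testTok
  where
    showTok (_, t)      = show t
    posFromTok (pos, _) = pos
    testTok (_, t)      = if x == t then Just t else Nothing

-- Accept any number token and return its value.
number :: GenParser PosTok () Int
number = token (show . snd) fst testTok
  where
    testTok (_, TNum n) = Just n
    testTok _           = Nothing

-- number '+' number
sumExpr :: GenParser PosTok () Int
sumExpr = do
  a <- number
  _ <- mytoken TPlus
  b <- number
  return (a + b)

-- A hand-built token stream for "1 + 2" on line 1.
sampleTokens :: [PosTok]
sampleTokens =
  [ (newPos "" 1 1, TNum 1)
  , (newPos "" 1 3, TPlus)
  , (newPos "" 1 5, TNum 2)
  ]
```

In GHCi, parse sumExpr "" sampleTokens runs the parser over the token list rather than over a String.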

tokens :: Eq tok => ([tok] -> String) -> (SourcePos -> [tok] -> SourcePos) -> [tok] -> GenParser tok st [tok]

tokenPrim :: (tok -> String) -> (SourcePos -> tok -> [tok] -> SourcePos) -> (tok -> Maybe a) -> GenParser tok st a

The parser tokenPrim showTok nextPos testTok accepts a token t with result x when the function testTok t returns Just x. The token can be shown using showTok t. The position of the next token should be returned when nextPos is called with the current source position pos, the current token t and the rest of the tokens toks, nextPos pos t toks.

This is the most primitive combinator for accepting tokens. For example, the char parser could be implemented as:

 char c
   = tokenPrim showChar nextPos testChar
   where
     showChar x        = show x
     testChar x        = if x == c then Just x else Nothing
     nextPos pos x xs  = updatePosChar pos x
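The same pattern generalises from one fixed character to a predicate. The following is a sketch only; satisfyChar and bit are hypothetical names, and the library's own predicate-based parser may be defined differently.

```haskell
import Text.ParserCombinators.Parsec
import Text.ParserCombinators.Parsec.Pos (updatePosChar)

-- Accept any character for which the predicate holds.
satisfyChar :: (Char -> Bool) -> Parser Char
satisfyChar ok = tokenPrim showChr nextPos test
  where
    showChr c       = show c                      -- shown in error messages
    test c          = if ok c then Just c else Nothing
    nextPos pos c _ = updatePosChar pos c         -- advance line and column

-- One binary digit, built from the primitive above.
bit :: Parser Char
bit = satisfyChar (`elem` "01") <?> "binary digit"
```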

tokenPrimEx :: (tok -> String) -> (SourcePos -> tok -> [tok] -> SourcePos) -> Maybe (SourcePos -> tok -> [tok] -> st -> st) -> (tok -> Maybe a) -> GenParser tok st a

The most primitive token recogniser. The expression tokenPrimEx show nextpos mbnextstate test recognises a token when test returns Just x, and returns the value x. Tokens are shown in error messages using show, and the position is calculated using nextpos. Finally, mbnextstate can hold a function that updates the user state on every token recognised (nice to count tokens :-). The function is packed into a Maybe type for performance reasons.
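As a sketch of the token-counting use mentioned above, the parser below accepts any character and increments an Int user state through the mbnextstate argument. The names countedChar and consumedCount are hypothetical.

```haskell
import Text.ParserCombinators.Parsec
import Text.ParserCombinators.Parsec.Prim (tokenPrimEx)
import Text.ParserCombinators.Parsec.Pos (updatePosChar)

-- Accept any character; the state function adds one to the Int user
-- state for every token recognised.
countedChar :: GenParser Char Int Char
countedChar = tokenPrimEx show nextPos (Just nextState) Just
  where
    nextPos pos c _       = updatePosChar pos c
    nextState _ _ _ count = count + 1

-- Consume the whole input and return how many tokens were recognised.
consumedCount :: String -> Either ParseError Int
consumedCount input = runParser (many countedChar >> getState) 0 "" input
```

Passing Nothing instead of Just nextState leaves the user state untouched.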

try :: GenParser tok st a -> GenParser tok st a

The parser try p behaves like parser p, except that it pretends that it hasn't consumed any input when an error occurs.

This combinator is used whenever arbitrary look ahead is needed. Since it pretends that it hasn't consumed any input when p fails, the (<|>) combinator will try its second alternative even when the first parser failed while consuming input.

The try combinator can for example be used to distinguish identifiers and reserved words. Both reserved words and identifiers are a sequence of letters. Whenever we expect a certain reserved word where we can also expect an identifier we have to use the try combinator. Suppose we write:

 expr        = letExpr <|> identifier <?> "expression"

 letExpr     = do{ string "let"; ... }
 identifier  = many1 letter

If the user writes "lexical", the parser fails with: unexpected 'x', expecting 't' in "let". Indeed, since the (<|>) combinator only tries alternatives when the first alternative hasn't consumed input, the identifier parser is never tried (because the prefix "le" of the string "let" parser is already consumed). The right behaviour can be obtained by adding the try combinator:

 expr        = letExpr <|> identifier <?> "expression"

 letExpr     = do{ try (string "let"); ... }
 identifier  = many1 letter
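A self-contained variant of the two definitions, reduced so that both can be run side by side. This is a simplification: exprNoTry, exprWithTry and result are hypothetical names, and the elided body of letExpr is replaced by the bare keyword.

```haskell
import Text.ParserCombinators.Parsec

identifier :: Parser String
identifier = many1 letter

-- Without try: string "let" consumes "le" of "lexical" before failing,
-- so identifier is never tried.
exprNoTry :: Parser String
exprNoTry = string "let" <|> identifier <?> "expression"

-- With try: the failed attempt counts as not having consumed input,
-- so identifier runs from the start of the input.
exprWithTry :: Parser String
exprWithTry = try (string "let") <|> identifier <?> "expression"

-- Run a parser and discard the error details.
result :: Parser String -> String -> Maybe String
result p input = either (const Nothing) Just (parse p "" input)
```

Note that exprWithTry still returns "let" for the input "letter", because the keyword alternative succeeds first; rejecting that case needs an additional check such as notFollowedBy.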

label :: GenParser tok st a -> String -> GenParser tok st a

labels :: GenParser tok st a -> [String] -> GenParser tok st a

unexpected :: String -> GenParser tok st a

The parser unexpected msg always fails with an unexpected error message msg without consuming any input.

The parsers fail, (<?>) and unexpected are the three parsers used to generate error messages. Of these, only (<?>) is commonly used. For an example of the use of unexpected, see the definition of notFollowedBy.
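One possible use of unexpected, sketched under the assumption of a small reserved-word list: an identifier parser that rejects keywords after reading them. The names reserved, identifier and accepts are hypothetical.

```haskell
import Text.ParserCombinators.Parsec

-- Hypothetical list of reserved words.
reserved :: [String]
reserved = ["let", "in"]

-- An identifier that is not a reserved word. The surrounding try makes
-- the rejection behave as if no input had been consumed.
identifier :: Parser String
identifier = try $ do
  name <- many1 letter
  if name `elem` reserved
    then unexpected ("reserved word " ++ show name)
    else return name

-- Run the parser and discard the error details.
accepts :: String -> Maybe String
accepts input = either (const Nothing) Just (parse identifier "" input)
```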

pzero :: GenParser tok st a

many :: GenParser tok st a -> GenParser tok st [a]

many p applies the parser p zero or more times. Returns a list of the returned values of p.

 identifier  = do{ c  <- letter
                 ; cs <- many (alphaNum <|> char '_')
                 ; return (c:cs)
                 }

skipMany :: GenParser tok st a -> GenParser tok st ()

skipMany p applies the parser p zero or more times, skipping its result.

 spaces  = skipMany space

getState :: GenParser tok st st

Returns the current user state.

setState :: st -> GenParser tok st ()

setState st sets the user state to st.

updateState :: (st -> st) -> GenParser tok st ()

updateState f applies function f to the user state. For example, to count the identifiers in a source, we could use the user state as follows:

 expr  = do{ x <- identifier
           ; updateState (+1)
           ; return (Id x)
           }

getPosition :: GenParser tok st SourcePos

Returns the current source position. See also SourcePos.

setPosition :: SourcePos -> GenParser tok st ()

setPosition pos sets the current source position to pos.
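A common use of getPosition is to tag parse results with the place where they started. The sketch below does this for whitespace-separated words; located and locatedWords are hypothetical names.

```haskell
import Text.ParserCombinators.Parsec

-- Pair a parser's result with the (line, column) where it started.
located :: Parser a -> Parser ((Int, Int), a)
located p = do
  pos <- getPosition
  x   <- p
  return ((sourceLine pos, sourceColumn pos), x)

-- Words separated by whitespace, each tagged with its start position.
locatedWords :: Parser [((Int, Int), String)]
locatedWords = do
  spaces
  many (do { w <- located (many1 letter); spaces; return w })
```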

getInput :: GenParser tok st [tok]

Returns the current input.

setInput :: [tok] -> GenParser tok st ()

setInput input continues parsing with input.
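One way to combine getInput and setInput, sketched here with hypothetical names withInput and mixed, is to parse an injected fragment and then resume the original input. This sketch does not adjust the source position, so positions reported while parsing the fragment refer to the outer input.

```haskell
import Text.ParserCombinators.Parsec

-- Parse the given fragment with p, then continue with the input that
-- was current before the fragment was injected.
withInput :: [tok] -> GenParser tok st a -> GenParser tok st a
withInput fragment p = do
  saved <- getInput
  setInput fragment
  x <- p
  setInput saved
  return x

-- Digits come from the injected fragment, letters from the real input.
mixed :: Parser (String, String)
mixed = do
  ds <- withInput "123" (many1 digit)
  ls <- many1 letter
  return (ds, ls)
```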

data State tok st

Constructors

State

Fields

stateInput :: [tok]
statePos :: !SourcePos
stateUser :: !st

getParserState :: GenParser tok st (State tok st)

Returns the full parser state as a State record.

setParserState :: State tok st -> GenParser tok st (State tok st)

setParserState st sets the full parser state to st.
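Saving and restoring the full state gives a simple form of look-ahead. The sketch below uses hypothetical names peek and startsWithAb; it rewinds only when p succeeds, so a p that fails after consuming input still fails in the usual way.

```haskell
import Text.ParserCombinators.Parsec
import Text.ParserCombinators.Parsec.Prim (getParserState, setParserState)

-- Run p, then rewind input, position and user state to where p started.
peek :: GenParser tok st a -> GenParser tok st a
peek p = do
  saved <- getParserState
  x     <- p
  _     <- setParserState saved
  return x

-- Check that the input starts with "ab", then parse the whole word.
startsWithAb :: Parser (String, String)
startsWithAb = do
  prefix <- peek (string "ab")
  whole  <- many1 letter
  return (prefix, whole)
```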