EarleyM- Monadic Earley Parsing

Safe HaskellNone




Monadic combinators for Earley Parsing


Grammars, Non-terminals, Productions

data Gram a Source

Type of grammars. RHS of a production rules. Synthesizing values of type a. Monadic construction allows context-sensitive grammars.


Monad Gram Source 
Functor Gram Source 
Applicative Gram Source 

alts :: [Gram a] -> Gram a Source

Grammar constructed from a list of alteratives. Alternations may be nested; we are not restricted to just having alternate productions for a given non-terminal.

fail :: Gram a Source

Grammar constructed from no alteratives. fail == alts []

many :: Gram a -> Gram [a] Source

Grammar for kleene *. Constructs an infinite RHS. Use in preference to right-recursion via a non-terminal, which has quadratic inefficiency.

skipWhile :: Gram () -> Gram () Source

Kleene * which ignores the synthesized value. skipWhile p == do _ <- many p; return ()

data Lang t a Source

Type for language definition over terminals (tokens) of type t. A collection of production rules together with an entry point. Constructed monadically.


Monad (Lang t) Source 
Functor (Lang t) Source 
Applicative (Lang t) Source 

getToken :: Lang t (Gram t) Source

Access to the grammar for tokens within a language definition.

data NT a Source

Type of non-terminals. LHS of a production rules. Carrying values of type a.


Show (NT a) Source 

createNamedNT :: Show a => String -> Lang t (NT a) Source

Create a fresh non-terminal within a language definition. The name is only used for debugging and reporting ambiguity. This is a low level primitive. Simpler to use declare.

referenceNT :: NT a -> Gram a Source

Reference a non-terminal on the RHS of a production. This is a low level primitive. Simpler to use declare.

declare :: Show a => String -> Lang t (NT a, Gram a) Source

Convenience combination of createNamedNT and referenceNT, returning a pair of values for a fresh non-terminal, for use on the LHS/RHS.

produce :: NT a -> Gram a -> Lang t () Source

Define a language production, linking the LHS and RHS of the rule.

fix :: Show a => String -> (Gram a -> Lang t (Gram a)) -> Lang t (Gram a) Source

Combination of declare/produce to allow reference to a grammar within its own defintion. Use this for language with left-recursion.


data Parsing a Source

Result of running a parsing function: parse or parseAmb. Combines the outcome with the effort taken.




effort :: Eff
outcome :: a


Functor Parsing Source 

data SyntaxError Source

Type describing a syntax-error encountered during parsing. In all cases the final position reached before the error is reported. This position is automatically determined by the Early parsing algorithm.


parseAmb :: (Show a, Show t) => Lang t (Gram a) -> [t] -> Parsing (Either SyntaxError [a]) Source

Entry-point to run a parse. Allows ambiguity. Parses a list of tokens using a Lang/Gram definition. Returns all parses or a syntax-error.

data Ambiguity Source

Type describing a parse ambiguity for a specific non-terminal (name), across a position range. This may be reported as an error by the parse entry point.


Ambiguity String Pos Pos 


data ParseError Source

Union of SyntaxError and Ambiguity, for reporting errors from parse.


parse :: (Show a, Show t) => Lang t (Gram a) -> [t] -> Parsing (Either ParseError a) Source

Entry-point to run a parse. Rejects ambiguity. Parse a list of tokens using a Lang/Gram definition. Returns the single parse or a parse-error.

newtype Eff Source

Type to represent the effort taken during parsing. Used by some unit-tests.


Eff Int 


Show Eff Source 

type Pos = Int Source

Type of positions. Used in parse error reports. Indicates the index of the input token list reached before the error was encountered.