EarleyM-0.1.0.0: Monadic Earley Parsing

Safe HaskellNone
LanguageHaskell2010

EarleyM

Contents

Description

Monadic combinators for Earley Parsing

Synopsis

Grammars, Non-terminals, Productions

data Gram a Source

Type of grammars. RHS of a production rules. Synthesizing values of type a. Monadic construction allows context-sensitive grammars.

Instances

Monad Gram Source 
Functor Gram Source 
Applicative Gram Source 

alts :: [Gram a] -> Gram a Source

Grammar constructed from a list of alteratives. Alternations may be nested; we are not restricted to just having alternate productions for a given non-terminal.

fail :: Gram a Source

Grammar constructed from no alteratives. fail == alts []

many :: Gram a -> Gram [a] Source

Grammar for kleene *. Constructs an infinite RHS. Use in preference to right-recursion via a non-terminal, which has quadratic inefficiency.

skipWhile :: Gram () -> Gram () Source

Kleene * which ignores the synthesized value. skipWhile p == do _ <- many p; return ()

data Lang t a Source

Type for language definition over terminals (tokens) of type t. A collection of production rules together with an entry point. Constructed monadically.

Instances

Monad (Lang t) Source 
Functor (Lang t) Source 
Applicative (Lang t) Source 

getToken :: Lang t (Gram t) Source

Access to the grammar for tokens within a language definition.

data NT a Source

Type of non-terminals. LHS of a production rules. Carrying values of type a.

Instances

Show (NT a) Source 

createNamedNT :: Show a => String -> Lang t (NT a) Source

Create a fresh non-terminal within a language definition. The name is only used for debugging and reporting ambiguity. This is a low level primitive. Simpler to use declare.

referenceNT :: NT a -> Gram a Source

Reference a non-terminal on the RHS of a production. This is a low level primitive. Simpler to use declare.

declare :: Show a => String -> Lang t (NT a, Gram a) Source

Convenience combination of createNamedNT and referenceNT, returning a pair of values for a fresh non-terminal, for use on the LHS/RHS.

produce :: NT a -> Gram a -> Lang t () Source

Define a language production, linking the LHS and RHS of the rule.

fix :: Show a => String -> (Gram a -> Lang t (Gram a)) -> Lang t (Gram a) Source

Combination of declare/produce to allow reference to a grammar within its own defintion. Use this for language with left-recursion.

Parsing

data Parsing a Source

Result of running a parsing function: parse or parseAmb. Combines the outcome with the effort taken.

Constructors

Parsing 

Fields

effort :: Eff
 
outcome :: a
 

Instances

Functor Parsing Source 

data SyntaxError Source

Type describing a syntax-error encountered during parsing. In all cases the final position reached before the error is reported. This position is automatically determined by the Early parsing algorithm.

Instances

parseAmb :: (Show a, Show t) => Lang t (Gram a) -> [t] -> Parsing (Either SyntaxError [a]) Source

Entry-point to run a parse. Allows ambiguity. Parses a list of tokens using a Lang/Gram definition. Returns all parses or a syntax-error.

data Ambiguity Source

Type describing a parse ambiguity for a specific non-terminal (name), across a position range. This may be reported as an error by the parse entry point.

Constructors

Ambiguity String Pos Pos 

Instances

data ParseError Source

Union of SyntaxError and Ambiguity, for reporting errors from parse.

Instances

parse :: (Show a, Show t) => Lang t (Gram a) -> [t] -> Parsing (Either ParseError a) Source

Entry-point to run a parse. Rejects ambiguity. Parse a list of tokens using a Lang/Gram definition. Returns the single parse or a parse-error.

newtype Eff Source

Type to represent the effort taken during parsing. Used by some unit-tests.

Constructors

Eff Int 

Instances

Show Eff Source 

type Pos = Int Source

Type of positions. Used in parse error reports. Indicates the index of the input token list reached before the error was encountered.