Earley-0.9.0: Parsing all context-free grammars using Earley's algorithm.

Safe HaskellNone
LanguageHaskell2010

Text.Earley

Contents

Description

Parsing all context-free grammars using Earley's algorithm.

Synopsis

Context-free grammars

data Prod r e t a Source

A production.

The type parameters are:

a: The return type of the production.

t: The type of the terminals that the production operates on.

e: The type of names, used for example to report expected tokens.

r: The type of a non-terminal. This plays a role similar to the s in the type ST s a. Since the parser function expects the r to be universally quantified, there is not much to do with this parameter other than leaving it universally quantified.

As an example, Prod r String Char Int is the type of a production that returns an Int, operates on (lists of) characters and reports String names.

Most of the functionality of Prods is obtained through its instances, e.g. Functor, Applicative, and Alternative.

Instances

Functor (Prod r e t) Source 
Applicative (Prod r e t) Source 
Alternative (Prod r e t) Source 
Monoid (Prod r e t a) Source 

satisfy :: (t -> Bool) -> Prod r e t t Source

Match a token that satisfies the given predicate. Returns the matched token.

(<?>) :: Prod r e t a -> e -> Prod r e t a infixr 0 Source

A named production (used for reporting expected things).

data Grammar r e a Source

A context-free grammar.

The type parameters are:

a: The return type of the grammar (often a Prod).

e: The type of names, used for example to report expected tokens.

r: The type of a non-terminal. This plays a role similar to the s in the type ST s a. Since the parser function expects the r to be universally quantified, there is not much to do with this parameter other than leaving it universally quantified.

Most of the functionality of Grammars is obtained through its instances, e.g. Monad and MonadFix. Note that GHC has syntactic sugar for MonadFix: use {-# LANGUAGE RecursiveDo #-} and mdo instead of do.

rule :: Prod r e t a -> Grammar r e (Prod r e t a) Source

Create a new non-terminal by giving its production.

Derived operators

symbol :: Eq t => t -> Prod r e t t Source

Match a single token.

namedSymbol :: Eq t => t -> Prod r t t t Source

Match a single token and give it the name of the token.

word :: Eq t => [t] -> Prod r e t [t] Source

Match a list of tokens in sequence.

Parsing

data Report e i Source

A parsing report, which contains fields that are useful for presenting errors to the user if a parse is deemed a failure. Note however that we get a report even when we successfully parse something.

Constructors

Report 

Fields

position :: Int

The final position in the input (0-based) that the parser reached.

expected :: [e]

The named productions processed at the final position.

unconsumed :: i

The part of the input string that was not consumed, which may be empty.

Instances

(Eq e, Eq i) => Eq (Report e i) Source 
(Ord e, Ord i) => Ord (Report e i) Source 
(Read e, Read i) => Read (Report e i) Source 
(Show e, Show i) => Show (Report e i) Source 

data Result s e i a Source

The result of a parse.

Constructors

Ended (Report e i)

The parser ended.

Parsed (ST s [a]) Int i (ST s (Result s e i a))

The parser parsed a number of as. These are given as a computation, ST s [a] that constructs the as when run. We can thus save some work by ignoring this computation if we do not care about the results. The Int is the position in the input where these results were obtained, the i the rest of the input, and the last component is the continuation.

Instances

parser :: ListLike i t => (forall r. Grammar r e (Prod r e t a)) -> i -> ST s (Result s e i a) Source

Create a parser from the given grammar.

allParses :: (forall s. ST s (Result s e i a)) -> ([(a, Int)], Report e i) Source

Return all parses from the result of a given parser. The result may contain partial parses. The Ints are the position at which a result was produced.

fullParses :: ListLike i t => (forall s. ST s (Result s e i a)) -> ([a], Report e i) Source

Return all parses that reached the end of the input from the result of a given parser.

Recognition

report :: ListLike i t => (forall s. ST s (Result s e i a)) -> Report e i Source

See e.g. how far the parser is able to parse the input string before it fails. This can be much faster than getting the parse results for highly ambiguous grammars.