Earley-0.12.0.1: Parsing all context-free grammars using Earley's algorithm.

Text.Earley

Description

Parsing all context-free grammars using Earley's algorithm.

Synopsis

# Context-free grammars

data Prod r e t a Source #

A production.

The type parameters are:

a: The return type of the production.

t for terminal: The type of the terminals that the production operates on.

e for expected: The type of names, used for example to report expected tokens.

r for rule: The type of a non-terminal. This plays a role similar to the s in the type ST s a. Since the parser function expects the r to be universally quantified, there is not much to do with this parameter other than leaving it universally quantified.

As an example, Prod r String Char Int is the type of a production that returns an Int, operates on (lists of) characters and reports String names.

Most of the functionality of Prods is obtained through its instances, e.g. Functor, Applicative, and Alternative.

Instances

 Functor (Prod r e t) Source # Methodsfmap :: (a -> b) -> Prod r e t a -> Prod r e t b #(<$) :: a -> Prod r e t b -> Prod r e t a # Applicative (Prod r e t) Source # Methodspure :: a -> Prod r e t a #(<*>) :: Prod r e t (a -> b) -> Prod r e t a -> Prod r e t b #(*>) :: Prod r e t a -> Prod r e t b -> Prod r e t b #(<*) :: Prod r e t a -> Prod r e t b -> Prod r e t a # Alternative (Prod r e t) Source # Methodsempty :: Prod r e t a #(<|>) :: Prod r e t a -> Prod r e t a -> Prod r e t a #some :: Prod r e t a -> Prod r e t [a] #many :: Prod r e t a -> Prod r e t [a] # (IsString t, Eq t, (~) * a t) => IsString (Prod r e t a) Source # String literals can be interpreted as Terminals that match that string.>>> :set -XOverloadedStrings >>> import Data.Text (Text) >>> let determiner = "the" <|> "a" <|> "an" :: Prod r e Text Text  MethodsfromString :: String -> Prod r e t a # Monoid (Prod r e t a) Source # Methodsmempty :: Prod r e t a #mappend :: Prod r e t a -> Prod r e t a -> Prod r e t a #mconcat :: [Prod r e t a] -> Prod r e t a # terminal :: (t -> Maybe a) -> Prod r e t a Source # Match a token for which the given predicate returns Just a, and return the a. (<?>) :: Prod r e t a -> e -> Prod r e t a infixr 0 Source # A named production (used for reporting expected things). data Grammar r a Source # A context-free grammar. The type parameters are: a: The return type of the grammar (often a Prod). r for rule: The type of a non-terminal. This plays a role similar to the s in the type ST s a. Since the parser function expects the r to be universally quantified, there is not much to do with this parameter other than leaving it universally quantified. Most of the functionality of Grammars is obtained through its instances, e.g. Monad and MonadFix. Note that GHC has syntactic sugar for MonadFix: use {-# LANGUAGE RecursiveDo #-} and mdo instead of do. Instances  Monad (Grammar r) Source # Methods(>>=) :: Grammar r a -> (a -> Grammar r b) -> Grammar r b #(>>) :: Grammar r a -> Grammar r b -> Grammar r b #return :: a -> Grammar r a #fail :: String -> Grammar r a # Source # Methodsfmap :: (a -> b) -> Grammar r a -> Grammar r b #(<$) :: a -> Grammar r b -> Grammar r a # Source # Methodsmfix :: (a -> Grammar r a) -> Grammar r a # Source # Methodspure :: a -> Grammar r a #(<*>) :: Grammar r (a -> b) -> Grammar r a -> Grammar r b #(*>) :: Grammar r a -> Grammar r b -> Grammar r b #(<*) :: Grammar r a -> Grammar r b -> Grammar r a #

rule :: Prod r e t a -> Grammar r (Prod r e t a) Source #

Create a new non-terminal by giving its production.

# Derived operators

satisfy :: (t -> Bool) -> Prod r e t t Source #

Match a token that satisfies the given predicate. Returns the matched token.

token :: Eq t => t -> Prod r e t t Source #

Match a single token.

namedToken :: Eq t => t -> Prod r t t t Source #

Match a single token and give it the name of the token.

list :: Eq t => [t] -> Prod r e t [t] Source #

Match a list of tokens in sequence.

listLike :: (Eq t, ListLike i t) => i -> Prod r e t i Source #

Match a ListLike of tokens in sequence.

# Deprecated operators

symbol :: Eq t => t -> Prod r e t t Source #

Deprecated: Use token instead

namedSymbol :: Eq t => t -> Prod r e t t Source #

Deprecated: Use namedToken instead

word :: Eq t => [t] -> Prod r e t [t] Source #

Deprecated: Use list or listLike instead

# Parsing

data Report e i Source #

A parsing report, which contains fields that are useful for presenting errors to the user if a parse is deemed a failure. Note however that we get a report even when we successfully parse something.

Constructors

 Report Fieldsposition :: IntThe final position in the input (0-based) that the parser reached.expected :: [e]The named productions processed at the final position.unconsumed :: iThe part of the input string that was not consumed, which may be empty.

Instances

data Result s e i a Source #

The result of a parse.

Constructors

 Ended (Report e i) The parser ended. Parsed (ST s [a]) Int i (ST s (Result s e i a)) The parser parsed a number of as. These are given as a computation, ST s [a] that constructs the as when run. We can thus save some work by ignoring this computation if we do not care about the results. The Int is the position in the input where these results were obtained, the i the rest of the input, and the last component is the continuation.

Instances

 Functor (Result s e i) Source # Methodsfmap :: (a -> b) -> Result s e i a -> Result s e i b #(<\$) :: a -> Result s e i b -> Result s e i a #

type Parser e i a = forall s. i -> ST s (Result s e i a) Source #

parser :: ListLike i t => (forall r. Grammar r (Prod r e t a)) -> Parser e i a Source #

Create a parser from the given grammar.

allParses :: Parser e i a -> i -> ([(a, Int)], Report e i) Source #

Return all parses from the result of a given parser. The result may contain partial parses. The Ints are the position at which a result was produced.

The elements of the returned list of results are sorted by their position in ascending order. If there are multiple results at the same position they are returned in an unspecified order.

fullParses :: ListLike i t => Parser e i a -> i -> ([a], Report e i) Source #

Return all parses that reached the end of the input from the result of a given parser.

If there are multiple results they are returned in an unspecified order.

# Recognition

report :: Parser e i a -> i -> Report e i Source #

See e.g. how far the parser is able to parse the input string before it fails. This can be much faster than getting the parse results for highly ambiguous grammars.

# Language generation

type Generator t a = forall s. ST s (Result s t a) Source #

generator :: (forall r. Grammar r (Prod r e t a)) -> [t] -> Generator t a Source #

Create a language generator for given grammar and list of allowed tokens.

language :: Generator t a -> [(a, [t])] Source #

Run a generator, returning all members of the language.

The members are returned as parse results paired with the list of tokens used to produce the result. The elements of the returned list of results are sorted by their length in ascending order. If there are multiple results of the same length they are returned in an unspecified order.

upTo :: Int -> Generator t a -> [(a, [t])] Source #

upTo n gen runs the generator gen, returning all members of the language that are of length less than or equal to n.

The members are returned as parse results paired with the list of tokens used to produce the result. The elements of the returned list of results are sorted by their length in ascending order. If there are multiple results of the same length they are returned in an unspecified order.

exactly :: Int -> Generator t a -> [(a, [t])] Source #

exactly n gen runs the generator gen, returning all members of the language that are of length equal to n.

The members are returned as parse results paired with the list of tokens used to produce the result. If there are multiple results they are returned in an unspecified order.