fungll-combinators-0.1.0.1: GLL parser with simple combinator interface

Safe HaskellNone
LanguageHaskell2010

GLL.ParserCombinators

Contents

Synopsis

Elementary parsers

term_parser :: Parseable t => t -> (t -> a) -> SymbExpr t a Source #

Create a symbol-parse for a terminal given:

  • The Parseable token represented by the terminal.
  • A function from that Parseable to a semantic result.

satisfy :: (Show t, Ord t) => a -> AltExpr t a Source #

The empty right-hand side that yields its first argument as a semantic result.

Elementary parsers using the Token datatype

keychar :: (Parseable t, SubsumesToken t) => Char -> SymbExpr t Char Source #

Parse a single character, using a SubsumesToken type.

keyword :: (Parseable t, SubsumesToken t) => String -> SymbExpr t String Source #

Parse a single character, using a SubsumesToken type.

int_lit :: (Parseable t, SubsumesToken t) => SymbExpr t Int Source #

Parse a single integer, using a SubsumesToken type. Returns the lexeme interpreted as an Int.

float_lit :: (Parseable t, SubsumesToken t) => SymbExpr t Double Source #

Parse a single floating point literal, using a SubsumesToken type. Returns the lexeme interpreted as a Double.

bool_lit :: (Parseable t, SubsumesToken t) => SymbExpr t Bool Source #

Parse a single Boolean, using a SubsumesToken type. Returns the lexeme interpreter as a Boolean.

char_lit :: (Parseable t, SubsumesToken t) => SymbExpr t Char Source #

Parse a single Character literal, using a SubsumesToken type. Returns the lexeme interpreted as a Character literal.

string_lit :: (Parseable t, SubsumesToken t) => SymbExpr t String Source #

Parse a single String literal, using a SubsumesToken type. Returns the lexeme interpreted as a String literal.

alt_id_lit :: (Parseable t, SubsumesToken t) => SymbExpr t String Source #

Parse a single alternative identifier, using a SubsumesToken type. Returns the lexeme as a String.

id_lit :: (Parseable t, SubsumesToken t) => SymbExpr t String Source #

Parse a single identifier, using a SubsumesToken type. Returns the lexeme as a String.

token :: (Parseable t, SubsumesToken t) => String -> SymbExpr t String Source #

Parse a single arbitrary token, using a SubsumesToken type. Returns the lexeme.

Elementary character-level parsers

char :: Char -> SymbExpr Char Char Source #

Parse a single character.

char c = term_parser c id

Currently, this is the only character-level combinator exported by this module. Please use token-level combinators for practical parsing. Might change in the future.

Elementary combinators

Sequencing

(<**>) :: (Show t, Ord t, IsAltExpr i, IsSymbExpr s) => i t (a -> b) -> s t a -> AltExpr t b infixl 4 Source #

Add a SymbExpr to the right-hand side represented by an AltExpr creating a new AltExpr. The semantic result of the first argument is applied to the second as a cross-product.

Choice

(<||>) :: (Show t, Ord t, IsAltExpr i, HasAlts b) => i t a -> b t a -> AltExprs t a infixr 3 Source #

Add an AltExpr to a list of AltExpr The resuling '[] :. AltExpr' forms the right-hand side of a rule.

Semantic actions

(<$$>) :: (Show t, Ord t, IsSymbExpr s) => (a -> b) -> s t a -> AltExpr t b infixl 4 Source #

Form an AltExpr by mapping some semantic action overy the result of the second argument.

Nonterminal introduction

(<:=>) :: (Show t, Ord t, HasAlts b) => String -> b t a -> SymbExpr t a infixl 2 Source #

Form a rule by giving the name of the left-hand side of the new rule. Use this combinator on recursive non-terminals.

(<::=>) :: (Show t, Ord t, HasAlts b) => String -> b t a -> SymbExpr t a infixl 2 Source #

Variant of <:=> for recursive non-terminals that have a potentially infinite number of derivations for some input string.

A non-terminal yields infinitely many derivations if and only if it is left-recursive and would be left-recursive if all the right-hand sides of the productions of the grammar are reversed.

chooses :: (Show t, Ord t, IsAltExpr alt) => String -> [alt t a] -> SymbExpr t a Source #

Variant of <::=> that can be supplied with a list of alternates

chooses_prec :: (Show t, Ord t, IsAltExpr alt) => String -> [alt t a] -> SymbExpr t a Source #

Variant of <::= that can be supplied with a list of alternates

Types

Grammar (combinator expression) types

type BNF t a = SymbExpr t a Source #

A combinator expression representing a BNF-grammar. The terminals of the grammar are of type t. When used to parse, the expression yields semantic results of type a.

data SymbExpr t a Source #

A combinator expression representing a symbol. A SymbExpr either represents a terminal or a nonterminal. In the latter case it is constructed with (a variant of) <:=> and adds a rule to the grammar of which the represented symbol is the left-hand side.

Instances
IsAltExpr SymbExpr Source # 
Instance details

Defined in GLL.Combinators.Visit.Join

Methods

toAlt :: (Show t, Ord t) => SymbExpr t b -> AltExpr t b Source #

HasAlts SymbExpr Source # 
Instance details

Defined in GLL.Combinators.Visit.Join

Methods

altsOf :: (Show t, Ord t) => SymbExpr t b -> [AltExpr t b] Source #

IsSymbExpr SymbExpr Source # 
Instance details

Defined in GLL.Combinators.Visit.Join

Methods

toSymb :: (Show t, Ord t) => SymbExpr t b -> SymbExpr t b Source #

mkRule :: (Show t, Ord t) => SymbExpr t b -> BNF t b Source #

data AltExpr t a Source #

A combinator expression representing an alternative: the right-hand side of a production.

Instances
IsAltExpr AltExprs Source # 
Instance details

Defined in GLL.Combinators.Visit.Join

Methods

toAlt :: (Show t, Ord t) => AltExprs t b -> AltExpr t b Source #

IsAltExpr AltExpr Source # 
Instance details

Defined in GLL.Combinators.Visit.Join

Methods

toAlt :: (Show t, Ord t) => AltExpr t b -> AltExpr t b Source #

HasAlts AltExprs Source # 
Instance details

Defined in GLL.Combinators.Visit.Join

Methods

altsOf :: (Show t, Ord t) => AltExprs t b -> [AltExpr t b] Source #

HasAlts AltExpr Source # 
Instance details

Defined in GLL.Combinators.Visit.Join

Methods

altsOf :: (Show t, Ord t) => AltExpr t b -> [AltExpr t b] Source #

IsSymbExpr AltExprs Source # 
Instance details

Defined in GLL.Combinators.Visit.Join

Methods

toSymb :: (Show t, Ord t) => AltExprs t b -> SymbExpr t b Source #

mkRule :: (Show t, Ord t) => AltExprs t b -> BNF t b Source #

IsSymbExpr AltExpr Source # 
Instance details

Defined in GLL.Combinators.Visit.Join

Methods

toSymb :: (Show t, Ord t) => AltExpr t b -> SymbExpr t b Source #

mkRule :: (Show t, Ord t) => AltExpr t b -> BNF t b Source #

type AltExprs = OO [] AltExpr Source #

A list of alternatives represents the right-hand side of a rule.

Parseable token types

data Token #

A datatype for representing tokens with some builtins and an aribitrary Token constructor. This datatype stores (optional) lexemes.

Constructors

Char Char 
Keyword String 
EOS 
Epsilon 
IntLit (Maybe Int) 
FloatLit (Maybe Double) 
BoolLit (Maybe Bool) 
StringLit (Maybe String) 
CharLit (Maybe Char) 
IDLit (Maybe String) 
AltIDLit (Maybe String)

alternative identifiers, for example functions vs. constructors (as in Haskell).

Token String (Maybe String) 
Instances
Eq Token 
Instance details

Defined in GLL.Types.Grammar

Methods

(==) :: Token -> Token -> Bool #

(/=) :: Token -> Token -> Bool #

Ord Token 
Instance details

Defined in GLL.Types.Grammar

Methods

compare :: Token -> Token -> Ordering #

(<) :: Token -> Token -> Bool #

(<=) :: Token -> Token -> Bool #

(>) :: Token -> Token -> Bool #

(>=) :: Token -> Token -> Bool #

max :: Token -> Token -> Token #

min :: Token -> Token -> Token #

Show Token 
Instance details

Defined in GLL.Types.Grammar

Methods

showsPrec :: Int -> Token -> ShowS #

show :: Token -> String #

showList :: [Token] -> ShowS #

Parseable Token 
Instance details

Defined in GLL.Types.Grammar

Methods

eos :: Token #

eps :: Token #

matches :: Token -> Token -> Bool #

unlex :: Token -> String #

SubsumesToken Token 
Instance details

Defined in GLL.Types.Grammar

class (Ord a, Eq a, Show a) => Parseable a where #

Class that captures elements of an input string (tokens).

  • eos is the end-of-string symbol
  • eps is the empty-string symbol

Both eos and eps must be distinct from eachother and from all tokens in the input string. The show instance is required to throw error messages.

Minimal complete definition

eos, eps, matches

Methods

eos :: a #

eps :: a #

matches :: a -> a -> Bool #

This function is used for matching grammar tokens and input tokens. Override this method if, for example, your input tokens store lexemes while the grammar tokens do not

unlex :: a -> String #

This function pretty-prints the Parseable type by displaying its lexeme. Default implementation is show, which should be replaced for prettier error messages.

Instances
Parseable Token 
Instance details

Defined in GLL.Types.Grammar

Methods

eos :: Token #

eps :: Token #

matches :: Token -> Token -> Bool #

unlex :: Token -> String #

class SubsumesToken a where #

Class whose members are super-types of Token.

Methods

upcast :: Token -> a #

downcast :: a -> Maybe Token #

Instances
SubsumesToken Token 
Instance details

Defined in GLL.Types.Grammar

unlexTokens :: [Token] -> String #

Pretty-prints a list of Tokens as a concatenation of their lexemes.

Running a parser

parse :: (Show t, Parseable t, IsSymbExpr s) => s t a -> [t] -> [a] Source #

Runs a parser given a string of Parseables and returns a list of semantic results, corresponding to all finitely many derivations.

printParseData :: (Parseable t, IsSymbExpr s, Show a) => s t a -> [t] -> IO () Source #

Print some information about the parse. Helpful for debugging.

evaluatorWithParseData :: (Parseable t, IsSymbExpr s, Show a) => s t a -> [t] -> [a] Source #

Print some information

Running a parser with options

parseWithOptions :: (Show t, Parseable t, IsSymbExpr s) => CombinatorOptions -> s t a -> [t] -> [a] Source #

Run the parser with some CombinatorOptions.

parseWithParseOptions :: (Show t, Parseable t, IsSymbExpr s) => ParseOptions -> CombinatorOptions -> s t a -> [t] -> [a] Source #

Run the parser with some ParseOptions and CombinatorOptions.

printParseDataWithOptions :: (Parseable t, IsSymbExpr s, Show a) => ParseOptions -> CombinatorOptions -> s t a -> [t] -> IO () Source #

Variant of printParseData which can be controlled by ParseOptions

Possible options

type CombinatorOptions = [CombinatorOption] #

A list of CombinatorOptions for evaluating combinator expressions.

type CombinatorOption = PCOptions -> PCOptions #

A single option.

maximumErrors :: Int -> CombinatorOption #

Set the maximum number of errors shown in case of an unsuccessful parse.

throwErrors :: CombinatorOption #

If there are no parse results, the default behaviour is to return an empty list. If this option is used, a runtime error will be reported, with debugging information.

maximumPivot :: CombinatorOption #

Enables a 'longest-match' at production level.

maximumPivotAtNt :: CombinatorOption #

Enables 'longest-match' at non-terminal level.

leftBiased :: CombinatorOption #

Turns all occurrences of <||> into a 'left biased' variant: only return results of the second alternate if the first alternate does not have any results.

Running a parser with options and explicit failure

Runing a parser to obtain ParseResult.

Builtin lexers.

default_lexer :: SubsumesToken t => String -> [t] #

A lexer using the default LexerSettings.

Lexer settings

lexer :: SubsumesToken t => LexerSettings -> String -> [t] #

A lexer parameterised by LexerSettings.

data LexerSettings #

Settings for changing the behaviour of the builtin lexer lexer. Lexers are built using Text.Regex.Applicative.

Constructors

LexerSettings 

Fields

Derived combinators

mkNt :: (Show t, Ord t, IsSymbExpr s) => s t a -> String -> String Source #

Helper function for defining new combinators. Use mkNt to form a new unique non-terminal name based on the symbol of a given SymbExpr and a String that is unique to the newly defined combinator.

Ignoring semantic results

(<$$) :: (Show t, Ord t, IsSymbExpr s) => b -> s t a -> AltExpr t b infixl 4 Source #

Variant of <$$> that ignores the semantic result of its second argument.

(**>) :: (Show t, Ord t, IsAltExpr i, IsSymbExpr s) => i t a -> s t b -> AltExpr t b infixl 4 Source #

Variant of <**> that ignores the semantic result of the first argument.

(<**) :: (Show t, Ord t, IsAltExpr i, IsSymbExpr s) => i t a -> s t b -> AltExpr t a infixl 4 Source #

Variant of <**> that ignores the semantic result of the second argument.

EBNF patterns

optional :: (Show t, Ord t, IsSymbExpr s) => s t a -> SymbExpr t (Maybe a) Source #

Derive either from the given symbol or the empty string.

preferably :: (Show t, Ord t, IsSymbExpr s) => s t a -> SymbExpr t (Maybe a) Source #

Version of optional that prefers to derive from the given symbol, affects only nullable nonterminal symbols

reluctantly :: (Show t, Ord t, IsSymbExpr s) => s t a -> SymbExpr t (Maybe a) Source #

Version of optional that prefers to derive the empty string from the given symbol, affects only nullable nonterminal symbols

optionalWithDef :: (Show t, Ord t, IsSymbExpr s) => s t a -> a -> SymbExpr t a Source #

multiple :: (Show t, Ord t, IsSymbExpr s) => s t a -> SymbExpr t [a] Source #

Try to apply a parser multiple times (0 or more). The results are returned in a list. In the case of ambiguity the largest list is returned.

multiple1 :: (Show t, Ord t, IsSymbExpr s) => s t a -> SymbExpr t [a] Source #

Try to apply a parser multiple times (1 or more). The results are returned in a list. In the case of ambiguity the largest list is returned.

multipleSepBy :: (Show t, Ord t, IsSymbExpr s, IsSymbExpr s2, IsAltExpr s2) => s t a -> s2 t b -> SymbExpr t [a] Source #

Same as multiple but with an additional separator.

multipleSepBy1 :: (Show t, Ord t, IsSymbExpr s, IsSymbExpr s2, IsAltExpr s2) => s t a -> s2 t b -> SymbExpr t [a] Source #

Same as multiple1 but with an additional separator.

multipleSepBy2 :: (Show t, Ord t, IsSymbExpr s2, IsSymbExpr s, IsAltExpr s2) => s t a -> s2 t b -> BNF t [a] Source #

Like multipleSepBy1 but matching at least two occurrences of the first argument. The returned list is therefore always of at least length 2. At least one separator will be consumed.

within :: (Show t, Ord t, IsSymbExpr s) => BNF t a -> s t b -> BNF t c -> BNF t b Source #

Place a piece of BNF within two other BNF fragments, ignoring their semantics.

parens :: (IsSymbExpr s, Parseable t, SubsumesToken t) => s t b -> BNF t b Source #

Place a piece of BNF between the characters '(' and ')'.

braces :: (IsSymbExpr s, Parseable t, SubsumesToken t) => s t b -> BNF t b Source #

Place a piece of BNF between the characters '{' and '}'.

brackets :: (IsSymbExpr s, Parseable t, SubsumesToken t) => s t b -> BNF t b Source #

Place a piece of BNF between the characters '[' and ']'.

angles :: (IsSymbExpr s, Parseable t, SubsumesToken t) => s t b -> BNF t b Source #

Place a piece of BNF between the characters < and >.

foldr_multiple :: (IsSymbExpr s, Parseable t) => s t (a -> a) -> a -> BNF t a Source #

foldr_multipleSepBy :: (IsSymbExpr s, Parseable t) => s t (a -> a) -> s t b -> a -> BNF t a Source #

Operator expressions

fromOpTable :: (SubsumesToken t, Parseable t, IsSymbExpr s) => String -> OpTable e -> s t e -> BNF t e Source #

type OpTable e = Map Double [(String, Fixity e)] Source #

A table mapping operator keywords to a Fixity and Assoc It provides a convenient way to build an expression grammar (see fromOpTable).

data Assoc Source #

Constructors

LAssoc 
RAssoc 
NA 

data Fixity e Source #

Constructors

Prefix (String -> e -> e) 
Infix (e -> String -> e -> e) Assoc 

Disambiguation

(<:=) :: (Show t, Ord t, HasAlts b) => String -> b t a -> SymbExpr t a infixl 2 Source #

Variant of <:=> that prioritises productions from left-to-right (or top-to-bottom).

(<::=) :: (Show t, Ord t, HasAlts b) => String -> b t a -> SymbExpr t a infixl 2 Source #

Variant of <::=> that prioritises productions from left-to-right (or top-to-bottom).

(<<<**>) :: (Show t, Ord t, IsAltExpr i, IsSymbExpr s) => i t (a -> b) -> s t a -> AltExpr t b infixl 4 Source #

Variant of <**> that applies shortest match on the left operand.

(<**>>>) :: (Show t, Ord t, IsAltExpr i, IsSymbExpr s) => i t (a -> b) -> s t a -> AltExpr t b infixl 4 Source #

Variant of <**> that applies longest match on the left operand.

(<<**>) :: (Show t, Ord t, IsAltExpr i, IsSymbExpr s) => i t a -> s t b -> AltExpr t b infixl 4 Source #

(<<<**) :: (Show t, Ord t, IsAltExpr i, IsSymbExpr s) => i t a -> s t b -> AltExpr t a infixl 4 Source #

Variant <** that applies shortest match on its left operand

(**>>>) :: (Show t, Ord t, IsAltExpr i, IsSymbExpr s) => i t a -> s t b -> AltExpr t b infixl 4 Source #

(<**>>) :: (Show t, Ord t, IsAltExpr i, IsSymbExpr s) => i t a -> s t b -> AltExpr t a infixl 4 Source #

Variant of <** that applies longest match on its left operand.

longest_match :: (Show t, Ord t, IsAltExpr alt) => alt t a -> AltExpr t a Source #

Apply this combinator to an alternative to turn all underlying occurrences of <**> (or variants) apply 'longest match'.

shortest_match :: (Show t, Ord t, IsAltExpr alt) => alt t a -> AltExpr t a Source #

many :: (Show t, Ord t, IsSymbExpr s) => s t a -> SymbExpr t [a] Source #

Try to apply a parser multiple times (0 or more) with shortest match applied to each occurrence of the parser.

many1 :: (Show t, Ord t, IsSymbExpr s) => s t a -> SymbExpr t [a] Source #

Try to apply a parser multiple times (1 or more) with shortest match applied to each occurrence of the parser.

some :: (Show t, Ord t, IsSymbExpr s) => s t a -> SymbExpr t [a] Source #

Try to apply a parser multiple times (0 or more) with longest match applied to each occurrence of the parser.

some1 :: (Show t, Ord t, IsSymbExpr s) => s t a -> SymbExpr t [a] Source #

Try to apply a parser multiple times (1 or more) with longest match applied to each occurrence of the parser.

manySepBy :: (Show t, Ord t, IsSymbExpr s, IsSymbExpr s2, IsAltExpr s2) => s t a -> s2 t b -> SymbExpr t [a] Source #

Same as many but with an additional separator.

manySepBy1 :: (Show t, Ord t, IsSymbExpr s, IsSymbExpr s2, IsAltExpr s2) => s t a -> s2 t b -> SymbExpr t [a] Source #

Same as many1 but with an additional separator.

manySepBy2 :: (Show t, Ord t, IsSymbExpr s2, IsSymbExpr s, IsAltExpr s2) => s t a -> s2 t b -> BNF t [a] Source #

Like multipleSepBy2 but matching the maximum number of occurrences of the first argument as possible (at least 2).

someSepBy :: (Show t, Ord t, IsSymbExpr s, IsSymbExpr s2, IsAltExpr s2) => s t a -> s2 t b -> SymbExpr t [a] Source #

Same as some1 but with an additional separator.

someSepBy1 :: (Show t, Ord t, IsSymbExpr s, IsSymbExpr s2, IsAltExpr s2) => s t a -> s2 t b -> SymbExpr t [a] Source #

Same as some1 but with an additional separator.

someSepBy2 :: (Show t, Ord t, IsSymbExpr s2, IsSymbExpr s, IsAltExpr s2) => s t a -> s2 t b -> BNF t [a] Source #

Like multipleSepBy2 but matching the minimum number of occurrences of the first argument as possible (at least 2).

Lifting

class HasAlts a where Source #

Class for lifting to AltExprs.

Methods

altsOf :: (Show t, Ord t) => a t b -> [AltExpr t b] Source #

Instances
HasAlts AltExprs Source # 
Instance details

Defined in GLL.Combinators.Visit.Join

Methods

altsOf :: (Show t, Ord t) => AltExprs t b -> [AltExpr t b] Source #

HasAlts AltExpr Source # 
Instance details

Defined in GLL.Combinators.Visit.Join

Methods

altsOf :: (Show t, Ord t) => AltExpr t b -> [AltExpr t b] Source #

HasAlts SymbExpr Source # 
Instance details

Defined in GLL.Combinators.Visit.Join

Methods

altsOf :: (Show t, Ord t) => SymbExpr t b -> [AltExpr t b] Source #

class IsSymbExpr a where Source #

Class for lifting to SymbExpr.

Minimal complete definition

toSymb

Methods

toSymb :: (Show t, Ord t) => a t b -> SymbExpr t b Source #

mkRule :: (Show t, Ord t) => a t b -> BNF t b Source #

Synonym of toSymb for creating derived combinators.

Instances
IsSymbExpr AltExprs Source # 
Instance details

Defined in GLL.Combinators.Visit.Join

Methods

toSymb :: (Show t, Ord t) => AltExprs t b -> SymbExpr t b Source #

mkRule :: (Show t, Ord t) => AltExprs t b -> BNF t b Source #

IsSymbExpr AltExpr Source # 
Instance details

Defined in GLL.Combinators.Visit.Join

Methods

toSymb :: (Show t, Ord t) => AltExpr t b -> SymbExpr t b Source #

mkRule :: (Show t, Ord t) => AltExpr t b -> BNF t b Source #

IsSymbExpr SymbExpr Source # 
Instance details

Defined in GLL.Combinators.Visit.Join

Methods

toSymb :: (Show t, Ord t) => SymbExpr t b -> SymbExpr t b Source #

mkRule :: (Show t, Ord t) => SymbExpr t b -> BNF t b Source #

class IsAltExpr a where Source #

Class for lifting to AltExpr.

Methods

toAlt :: (Show t, Ord t) => a t b -> AltExpr t b Source #

Instances
IsAltExpr AltExprs Source # 
Instance details

Defined in GLL.Combinators.Visit.Join

Methods

toAlt :: (Show t, Ord t) => AltExprs t b -> AltExpr t b Source #

IsAltExpr AltExpr Source # 
Instance details

Defined in GLL.Combinators.Visit.Join

Methods

toAlt :: (Show t, Ord t) => AltExpr t b -> AltExpr t b Source #

IsAltExpr SymbExpr Source # 
Instance details

Defined in GLL.Combinators.Visit.Join

Methods

toAlt :: (Show t, Ord t) => SymbExpr t b -> AltExpr t b Source #

Memoisation

memo :: (Ord t, Show t, IsSymbExpr s) => MemoRef [a] -> s t a -> SymbExpr t a Source #

This function memoises a parser, given:

Use memo on those parsers that are expected to derive the same substring multiple times. If the same combinator expression is used to parse multiple times the MemoRef needs to be cleared using memClear.

memo relies on unsafePerformIO and is therefore potentially unsafe. The option useMemoisation enables memoisation. It is off by default, even if memo is used in a combinator expression.

newMemoTable :: MemoRef a #

Create a reference to a fresh MemoTable.

memClear :: MemoRef a -> IO () #

Clears the MemoTable to which the given reference refers.

type MemoTable a = IntMap (IntMap a) #

A MemoTable maps left-extent l to right-extent r to some results a indicating the the substring ranging from l to r is derived with parse result a.

type MemoRef a = IORef (MemoTable a) #

An impure reference to a MemoTable.

useMemoisation :: CombinatorOption #

Whether to use unsafe memoisation to speed up the enumeration of parse results.