| Safe Haskell | Safe-Inferred |
|---|---|
| Language | Haskell2010 |
Little.Earley
Description
Utilities for interpreting context-free grammars as Earley parsers and for processing parse trees.
The library is parameterized by types for non-terminal symbols n,
terminal symbols t, and tokens c.
- Non-terminal symbols
nare associated with production rules. - Terminal symbols
tdenote sets of tokensc. - Lists of tokens
[c]form the actual inputs.
Example
import Data.Char (isDigit) import Little.Earley data Ns = SUM | PRODUCT | FACTOR | NUMBER deriving (Eq, Ord, Enum, Bounded, Show) data Ts = Digit | OneOf [Char] deriving (Eq, Ord, Show) arithRules :: Ns -> [RuleNs Ts] arithRules n = case n of SUM -> [ [NPRODUCT ] , [NSUM,T(OneOf ['+', '-']),NPRODUCT ] ] PRODUCT -> [ [NFACTOR ] , [NPRODUCT,T(OneOf ['*', '/']),NFACTOR ] ] FACTOR -> [ [NNUMBER ] , [T(OneOf ['(']),NSUM,T(OneOf [')']) ] ] NUMBER -> [ [TDigit ] , [TDigit,NNUMBER ] ] matchTs :: Ts -> Char -> Bool matchTs Digit = isDigit matchTs (OneOf s) = (`elem` s) arithG ::GrammarNs Ts Char arithG =mkGrammararithRules matchTs
Load that file in GHCi, then parse a string using that grammar:
> pparse arithG SUM "1+2*3"
See also Little.Earley.Examples for some examples of grammars.
Synopsis
- parse :: (Ord n, Ord t) => Grammar n t c -> n -> [c] -> Result n t c
- pparse :: (Ord n, Ord t) => Grammar n t c -> n -> [c] -> Pretty (Result n t c)
- parseTreeSet :: (Ord n, Ord t) => Grammar n t c -> n -> [c] -> Maybe (TreeSet n t c)
- accepts :: (Ord n, Ord t) => Grammar n t c -> n -> [c] -> Bool
- data Result n t c
- = ParseError (Error t c)
- | ParseSuccess (ParsedResult n t c)
- data ParsedResult n t c = ParsedResult {
- prParseTree :: TreeSet n t c
- prAmbiguities :: [LocAmbiguity n t c]
- data Error t c = Error {
- errorPosition :: Int
- errorToken :: Maybe c
- errorExpected :: [t]
- errorInput :: [c]
- data Grammar n t c = Grammar {}
- mkGrammar :: (Ord n, Bounded n, Enum n) => (n -> [Rule n t]) -> (t -> c -> Bool) -> Grammar n t c
- type Rule n t = [Atom n t]
- data Atom n t
- data RuleId n = RuleId n Int
- data TreeT f n t c
- type Tree = TreeT NoExt
- type TreeSet = TreeT Choice
- type TruncatedTree = TreeT (Sum Ellipsis NoExt)
- type TruncatedTreeSet = TreeT (Sum Ellipsis Choice)
- data NoExt a
- data Choice a = a :|: a
- class HasChoice f
- (|:) :: HasChoice f => TreeT f n t c -> TreeT f n t c -> TreeT f n t c
- data Ellipsis a = Ellipsis
- ellipsis :: TreeT (Sum Ellipsis f) n t c
- data Sum f g a
- fromSingleton :: TreeSet n t c -> Maybe (Tree n t c)
- arbTree :: TreeSet n t c -> Tree n t c
- truncateTree :: Functor f => Int -> TreeT f n t c -> TreeT (Sum Ellipsis f) n t c
- ambiguities :: TreeSet n t c -> [LocAmbiguity n t c]
- data Range = Range {}
- data Ambiguity n t c = Ambiguity (Tree n t c) (Tree n t c)
- type LocAmbiguity n t c = (Range, Ambiguity n t c)
- class PrettyPrint a where
- prettyPrint :: a -> String
- newtype Pretty a = Pretty a
- drawTree :: (n -> String) -> (c -> String) -> Tree n t c -> [String]
- prettyTree :: (PrettyPrint n, PrettyPrint c) => Tree n t c -> [String]
Basic interface
parse :: (Ord n, Ord t) => Grammar n t c -> n -> [c] -> Result n t c Source #
Parse a chain of tokens [c] given a grammar and a starting symbol n.
Variants:
pparse: pretty-printed; use this in the REPL.parseTreeSet: outputs just the parse tree.accepts: outputs a mere boolean.
Example
parse arithG SUM "1+2*3"
pparse :: (Ord n, Ord t) => Grammar n t c -> n -> [c] -> Pretty (Result n t c) Source #
Wrapped parse with a pretty-printed result. Use this in the REPL.
Example
pparse arithG SUM "1+2*3"
Output:
+-----+--SUM #1---+
| | |
SUM #0 | +PRODUCT #1-+
| | | | |
PRODUCT #0 | PRODUCT #0 | |
| | | | |
FACTOR #0 | FACTOR #0 | FACTOR #0
| | | | |
NUMBER #0 | NUMBER #0 | NUMBER #0
| | | | |
-----------------------------------
1 + 2 * 3parseTreeSet :: (Ord n, Ord t) => Grammar n t c -> n -> [c] -> Maybe (TreeSet n t c) Source #
Parse a chain of tokens [c] into a parse tree.
Simplified variant of parse.
accepts :: (Ord n, Ord t) => Grammar n t c -> n -> [c] -> Bool Source #
Check whether a grammar matches a chain of character [c] from a starting symbol n.
Result of parse.
Constructors
| ParseError (Error t c) | |
| ParseSuccess (ParsedResult n t c) |
Instances
| (PrettyPrint n, PrettyPrint t, PrettyPrint c) => PrettyPrint (Result n t c) Source # | |
Defined in Little.Earley.Internal.Result Methods prettyPrint :: Result n t c -> String Source # | |
data ParsedResult n t c Source #
Successful result of parse.
Constructors
| ParsedResult | |
Fields
| |
Parser error information.
Constructors
| Error | |
Fields
| |
Grammars
Grammars with non-terminal symbols n, terminal symbols t, and tokens c.
A grammar defines a language, which is a set of sequences of tokens c.
Two basic choices for t and c are:
t =andCharTc = Char, with: then the inputmatch=matchCharT[c]is aString.t = Stringandc = String, with: then the inputmatch= (==)[c]is a[String], which can be produced usingwords; just remember to put spaces around operators and parentheses.
See also examples in Little.Earley.Examples.
Constructors
| Grammar | |
Fields
| |
mkGrammar :: (Ord n, Bounded n, Enum n) => (n -> [Rule n t]) -> (t -> c -> Bool) -> Grammar n t c Source #
Construct a grammar given the fields rules and match,
implicitly populating isNullable.
An atom is either a non-terminal or a terminal.
A rule can be identified by a non-terminal and an index into all of the associated rules of that non-terminal.
Parse trees
Generalized parse tree.
A basic parse tree (Tree) consists of leaves labeled terminal symbols t (Leaf)
and nodes labeled with grammar rules associated to nonterminal symbols (().Brch)
Other variants of parse trees (TreeSet, TruncatedTreeSet) can be represented
using extension nodes (Ext).
Trees may be infinite due to an input string matching infinitely many parse trees.
Note that even though StrictData is enabled, we get laziness via the list type []
and tuple type (,).
Constructors
| Leaf Int t c | The |
| Brch (RuleId n) Int Int [TreeT f n t c] | The |
| Ext (f (TreeT f n t c)) |
type TruncatedTree = TreeT (Sum Ellipsis NoExt) Source #
Result of truncateTree applied to a Tree.
type TruncatedTreeSet = TreeT (Sum Ellipsis Choice) Source #
Result of truncateTree applied to a TreeSet.
Parse tree modifiers
Choice constructor to represent TreeSet.
Constructors
| a :|: a infixr 1 |
Overloaded version of (.:|:)
Minimal complete definition
(|:) :: HasChoice f => TreeT f n t c -> TreeT f n t c -> TreeT f n t c infixr 1 Source #
Construct the disjunction of two trees featuring the Choice functor.
Ellided by truncateTree.
Constructors
| Ellipsis |
Like Sum from Data.Functor.Sum but with more basic instances
Operations on parse trees
arbTree :: TreeSet n t c -> Tree n t c Source #
Get an arbitrary Tree from a TreeSet, even if it is ambiguous.
truncateTree :: Functor f => Int -> TreeT f n t c -> TreeT (Sum Ellipsis f) n t c Source #
Truncate a tree to finite depth.
truncateTree:: Int ->TreeSetn t c ->TruncatedTreeSetn t ctruncateTree:: Int ->Treen t c ->TruncatedTreen t c
ambiguities :: TreeSet n t c -> [LocAmbiguity n t c] Source #
Enumerate (some) ambiguous parses.
If there are multiple ambiguities at the same location, we just pick an arbitrary example.
An interval in some input sequence.
Evidence of ambiguity: two parse trees for the same input.
type LocAmbiguity n t c = (Range, Ambiguity n t c) Source #
Ambiguity at a given location.
Pretty-printing
class PrettyPrint a where Source #
A class for ad hoc pretty-printers.
Minimal complete definition
Nothing
Instances
| PrettyPrint Char Source # | Display the character without quotes. |
Defined in Little.Earley.Internal.Pretty Methods prettyPrint :: Char -> String Source # | |
| Show a => PrettyPrint a Source # | |
Defined in Little.Earley.Internal.PrettyOrphan Methods prettyPrint :: a -> String Source # | |
| PrettyPrint String Source # | Display the string without quotes. |
Defined in Little.Earley.Internal.Pretty Methods prettyPrint :: String -> String Source # | |
| (PrettyPrint n, PrettyPrint t, PrettyPrint c) => PrettyPrint (Result n t c) Source # | |
Defined in Little.Earley.Internal.Result Methods prettyPrint :: Result n t c -> String Source # | |
Wrapper whose Show instance uses PrettyPrint.
This provides a convenient and explicit way to display results nicely in the
REPL. See also pparse.
Constructors
| Pretty a |
drawTree :: (n -> String) -> (c -> String) -> Tree n t c -> [String] Source #
Draw a tree in the terminal.
prettyTree :: (PrettyPrint n, PrettyPrint c) => Tree n t c -> [String] Source #
drawTree using prettyPrint to show symbols.