Safe Haskell | Safe-Inferred |
---|---|
Language | Haskell2010 |
Utilities for interpreting context-free grammars as Earley parsers and for processing parse trees.
The library is parameterized by types for non-terminal symbols n
,
terminal symbols t
, and tokens c
.
- Non-terminal symbols
n
are associated with production rules. - Terminal symbols
t
denote sets of tokensc
. - Lists of tokens
[c]
form the actual inputs.
Example
import Data.Char (isDigit) import Little.Earley data Ns = SUM | PRODUCT | FACTOR | NUMBER deriving (Eq, Ord, Enum, Bounded, Show) data Ts = Digit | OneOf [Char] deriving (Eq, Ord, Show) arithRules :: Ns -> [Rule
Ns Ts] arithRules n = case n of SUM -> [ [N
PRODUCT ] , [N
SUM,T
(OneOf ['+', '-']),N
PRODUCT ] ] PRODUCT -> [ [N
FACTOR ] , [N
PRODUCT,T
(OneOf ['*', '/']),N
FACTOR ] ] FACTOR -> [ [N
NUMBER ] , [T
(OneOf ['(']),N
SUM,T
(OneOf [')']) ] ] NUMBER -> [ [T
Digit ] , [T
Digit,N
NUMBER ] ] matchTs :: Ts -> Char -> Bool matchTs Digit = isDigit matchTs (OneOf s) = (`elem` s) arithG ::Grammar
Ns Ts Char arithG =mkGrammar
arithRules matchTs
Load that file in GHCi, then parse a string using that grammar:
> pparse
arithG SUM "1+2*3"
See also Little.Earley.Examples for some examples of grammars.
Synopsis
- parse :: (Ord n, Ord t) => Grammar n t c -> n -> [c] -> Result n t c
- pparse :: (Ord n, Ord t) => Grammar n t c -> n -> [c] -> Pretty (Result n t c)
- parseTreeSet :: (Ord n, Ord t) => Grammar n t c -> n -> [c] -> Maybe (TreeSet n t c)
- accepts :: (Ord n, Ord t) => Grammar n t c -> n -> [c] -> Bool
- data Result n t c
- = ParseError (Error t c)
- | ParseSuccess (ParsedResult n t c)
- data ParsedResult n t c = ParsedResult {
- prParseTree :: TreeSet n t c
- prAmbiguities :: [LocAmbiguity n t c]
- data Error t c = Error {
- errorPosition :: Int
- errorToken :: Maybe c
- errorExpected :: [t]
- errorInput :: [c]
- data Grammar n t c = Grammar {}
- mkGrammar :: (Ord n, Bounded n, Enum n) => (n -> [Rule n t]) -> (t -> c -> Bool) -> Grammar n t c
- type Rule n t = [Atom n t]
- data Atom n t
- data RuleId n = RuleId n Int
- data TreeT f n t c
- type Tree = TreeT NoExt
- type TreeSet = TreeT Choice
- type TruncatedTree = TreeT (Sum Ellipsis NoExt)
- type TruncatedTreeSet = TreeT (Sum Ellipsis Choice)
- data NoExt a
- data Choice a = a :|: a
- class HasChoice f
- (|:) :: HasChoice f => TreeT f n t c -> TreeT f n t c -> TreeT f n t c
- data Ellipsis a = Ellipsis
- ellipsis :: TreeT (Sum Ellipsis f) n t c
- data Sum f g a
- fromSingleton :: TreeSet n t c -> Maybe (Tree n t c)
- arbTree :: TreeSet n t c -> Tree n t c
- truncateTree :: Functor f => Int -> TreeT f n t c -> TreeT (Sum Ellipsis f) n t c
- ambiguities :: TreeSet n t c -> [LocAmbiguity n t c]
- data Range = Range {}
- data Ambiguity n t c = Ambiguity (Tree n t c) (Tree n t c)
- type LocAmbiguity n t c = (Range, Ambiguity n t c)
- class PrettyPrint a where
- prettyPrint :: a -> String
- newtype Pretty a = Pretty a
- drawTree :: (n -> String) -> (c -> String) -> Tree n t c -> [String]
- prettyTree :: (PrettyPrint n, PrettyPrint c) => Tree n t c -> [String]
Basic interface
parse :: (Ord n, Ord t) => Grammar n t c -> n -> [c] -> Result n t c Source #
Parse a chain of tokens [c]
given a grammar and a starting symbol n
.
Variants:
pparse
: pretty-printed; use this in the REPL.parseTreeSet
: outputs just the parse tree.accepts
: outputs a mere boolean.
Example
parse
arithG SUM "1+2*3"
pparse :: (Ord n, Ord t) => Grammar n t c -> n -> [c] -> Pretty (Result n t c) Source #
Wrapped parse
with a pretty-printed result. Use this in the REPL.
Example
pparse
arithG SUM "1+2*3"
Output:
+-----+--SUM #1---+ | | | SUM #0 | +PRODUCT #1-+ | | | | | PRODUCT #0 | PRODUCT #0 | | | | | | | FACTOR #0 | FACTOR #0 | FACTOR #0 | | | | | NUMBER #0 | NUMBER #0 | NUMBER #0 | | | | | ----------------------------------- 1 + 2 * 3
parseTreeSet :: (Ord n, Ord t) => Grammar n t c -> n -> [c] -> Maybe (TreeSet n t c) Source #
Parse a chain of tokens [c]
into a parse tree.
Simplified variant of parse
.
accepts :: (Ord n, Ord t) => Grammar n t c -> n -> [c] -> Bool Source #
Check whether a grammar matches a chain of character [c]
from a starting symbol n
.
Result of parse
.
ParseError (Error t c) | |
ParseSuccess (ParsedResult n t c) |
Instances
(PrettyPrint n, PrettyPrint t, PrettyPrint c) => PrettyPrint (Result n t c) Source # | |
Defined in Little.Earley.Internal.Result prettyPrint :: Result n t c -> String Source # |
data ParsedResult n t c Source #
Successful result of parse
.
ParsedResult | |
|
Parser error information.
Error | |
|
Grammars
Grammars with non-terminal symbols n
, terminal symbols t
, and tokens c
.
A grammar defines a language, which is a set of sequences of tokens c
.
Two basic choices for t
and c
are:
t =
andCharT
c = Char
, with
: then the inputmatch
=matchCharT
[c]
is aString
.t = String
andc = String
, with
: then the inputmatch
= (==)[c]
is a[String]
, which can be produced usingwords
; just remember to put spaces around operators and parentheses.
See also examples in Little.Earley.Examples.
Grammar | |
|
mkGrammar :: (Ord n, Bounded n, Enum n) => (n -> [Rule n t]) -> (t -> c -> Bool) -> Grammar n t c Source #
Construct a grammar given the fields rules
and match
,
implicitly populating isNullable
.
An atom is either a non-terminal or a terminal.
A rule can be identified by a non-terminal and an index into all of the associated rules of that non-terminal.
Parse trees
Generalized parse tree.
A basic parse tree (Tree
) consists of leaves labeled terminal symbols t
(Leaf
)
and nodes labeled with grammar rules associated to nonterminal symbols ((
).Brch
)
Other variants of parse trees (TreeSet
, TruncatedTreeSet
) can be represented
using extension nodes (Ext
).
Trees may be infinite due to an input string matching infinitely many parse trees.
Note that even though StrictData
is enabled, we get laziness via the list type []
and tuple type (,)
.
Leaf Int t c | The |
Brch (RuleId n) Int Int [TreeT f n t c] | The |
Ext (f (TreeT f n t c)) |
type TruncatedTree = TreeT (Sum Ellipsis NoExt) Source #
Result of truncateTree
applied to a Tree
.
type TruncatedTreeSet = TreeT (Sum Ellipsis Choice) Source #
Result of truncateTree
applied to a TreeSet
.
Parse tree modifiers
Choice constructor to represent TreeSet
.
a :|: a infixr 1 |
Overloaded version of (
.:|:
)
(|:) :: HasChoice f => TreeT f n t c -> TreeT f n t c -> TreeT f n t c infixr 1 Source #
Construct the disjunction of two trees featuring the Choice
functor.
Ellided by truncateTree
.
Like Sum
from Data.Functor.Sum but with more basic instances
Operations on parse trees
arbTree :: TreeSet n t c -> Tree n t c Source #
Get an arbitrary Tree
from a TreeSet
, even if it is ambiguous.
truncateTree :: Functor f => Int -> TreeT f n t c -> TreeT (Sum Ellipsis f) n t c Source #
Truncate a tree to finite depth.
truncateTree
:: Int ->TreeSet
n t c ->TruncatedTreeSet
n t ctruncateTree
:: Int ->Tree
n t c ->TruncatedTree
n t c
ambiguities :: TreeSet n t c -> [LocAmbiguity n t c] Source #
Enumerate (some) ambiguous parses.
If there are multiple ambiguities at the same location, we just pick an arbitrary example.
An interval in some input sequence.
Evidence of ambiguity: two parse trees for the same input.
type LocAmbiguity n t c = (Range, Ambiguity n t c) Source #
Ambiguity at a given location.
Pretty-printing
class PrettyPrint a where Source #
A class for ad hoc pretty-printers.
Nothing
prettyPrint :: a -> String Source #
default prettyPrint :: Show a => a -> String Source #
Instances
PrettyPrint Char Source # | Display the character without quotes. |
Defined in Little.Earley.Internal.Pretty prettyPrint :: Char -> String Source # | |
Show a => PrettyPrint a Source # | |
Defined in Little.Earley.Internal.PrettyOrphan prettyPrint :: a -> String Source # | |
PrettyPrint String Source # | Display the string without quotes. |
Defined in Little.Earley.Internal.Pretty prettyPrint :: String -> String Source # | |
(PrettyPrint n, PrettyPrint t, PrettyPrint c) => PrettyPrint (Result n t c) Source # | |
Defined in Little.Earley.Internal.Result prettyPrint :: Result n t c -> String Source # |
Wrapper whose Show
instance uses PrettyPrint
.
This provides a convenient and explicit way to display results nicely in the
REPL. See also pparse
.
Pretty a |
drawTree :: (n -> String) -> (c -> String) -> Tree n t c -> [String] Source #
Draw a tree in the terminal.
prettyTree :: (PrettyPrint n, PrettyPrint c) => Tree n t c -> [String] Source #
drawTree
using prettyPrint
to show symbols.