# A little Earley parser Parser for context-free grammars. ## Example The following module defines a grammar for arithmetic expressions (like `1+2*3`). It defines: - A type of non-terminal symbols `Ns`. - A type of terminal symbols `Ts`. - The production rules of the grammar `arithRules`. - A matching function, which interprets terminal symbols `Ts` as sets of lexemes, of type `Char` here. - A grammar `arithG` packaging up all of the above. ```haskell import Data.Char (isDigit) import Little.Earley data Ns = SUM | PRODUCT | FACTOR | NUMBER deriving (Eq, Ord, Enum, Bounded, Show) data Ts = Digit | OneOf [Char] deriving (Eq, Ord, Show) arithRules :: Ns -> [Rule Ns Ts] arithRules n = case n of SUM -> [ [ N PRODUCT ] , [ N SUM, T (OneOf ['+', '-']), N PRODUCT ] ] PRODUCT -> [ [ N FACTOR ] , [ N PRODUCT, T (OneOf ['*', '/']), N FACTOR ] ] FACTOR -> [ [ N NUMBER ] , [ T (OneOf ['(']), N SUM, T (OneOf [')']) ] ] NUMBER -> [ [ T Digit ] , [ T Digit, N NUMBER ] ] matchTs :: Ts -> Char -> Bool matchTs Digit = isDigit matchTs (OneOf s) = (`elem` s) arithG :: Grammar Ns Ts Char arithG = mkGrammar arithRules matchTs ``` Load that file in GHCi: ``` ghci -package little-earley example.hs ``` Parse a string using that grammar: ```haskell > pparse arithG SUM "1+2*3" ``` Output: ``` +-----+--SUM #1---+ | | | SUM #0 | +PRODUCT #1-+ | | | | | PRODUCT #0 | PRODUCT #0 | | | | | | | FACTOR #0 | FACTOR #0 | FACTOR #0 | | | | | NUMBER #0 | NUMBER #0 | NUMBER #0 | | | | | ----------------------------------- 1 + 2 * 3 ``` The module `Little.Earley.Examples` contains more examples of grammars using this library. ## Links - This Earley parser implementation is based on [Loup Vaillant's tutorial](https://loup-vaillant.fr/tutorials/earley-parsing). - See also the [Earley](https://hackage.haskell.org/package/Earley) library also in Haskell, with much fancier types to encode arbitrary semantic actions instead of clumsy trees.