pinchot-0.2.0.0: Build parsers and ASTs for context-free grammars

Safe HaskellNone
LanguageHaskell2010

Pinchot

Contents

Description

Pinchot provides a simple language that you use to write a Haskell program that describes a context-free grammar. When run, this program creates a value that stores your context-free grammar. You can then use Template Haskell to take this value and generate a series of data types that correspond to your context-free grammar. You can also use Template Haskell to create an Earley parser that will parse all strings in the context-free language.

For examples, please consult Pinchot.Examples.

You should also look at the BNF Converter.

http://bnfc.digitalgrammars.com

Primary differences between BNFC and this library:

  • the BNF Converter works as a standalone binary that parses text BNF files. With Pinchot you specify your grammar in Haskell.
  • the BNF Converter currently generates many more outputs, such as LaTeX. It also generates code for many languages. Pinchot only works in Haskell.
  • the BNF Converter generates input for parser generators like Happy and Bison. Pinchot currently only generates input for the Haskell Earley library.
  • Pinchot integrates seamlessly into Haskell using Template Haskell.
  • the BNF Converter is GPL. Pinchot is BSD3.

Pinchot grows and harvests syntax trees, so it is named after Gifford Pinchot, first chief of the United States Forest Service.

Synopsis

Intervals

data Intervals a Source

Groups of terminals. Create an Intervals using include, exclude, solo and pariah. Combine Intervals using mappend, which will combine both the included and excluded terminal symbols from each operand.

include :: a -> a -> Intervals a Source

Include a range of symbols in the Intervals. For instance, to include the characters a, b, and c, use include a c.

exclude :: a -> a -> Intervals a Source

Exclude a range of symbols in the Intervals. Each symbol that is excluded is not included in the Intervals, even if it is also included.

solo :: a -> Intervals a Source

Include a single symbol.

pariah :: a -> Intervals a Source

Exclude a single symbol.

Simple production rules

data Pinchot t a Source

Constructs new Rules. t is the type of the token; often this will be Char.

Pinchot is a Monad and an Applicative so you can combine computations using the usual methods of those classes. Also, Pinchot is a MonadFix. This allows you to construct a Rule that depends on itself, and to construct sets of Rules that have mutually recursive dependencies. MonadFix also allows you to use the GHC RecursiveDo extension. Put

{-# LANGUAGE RecursiveDo #-}

at the top of your module, then use mdo instead of do. Because an mdo block is recursive, you can use a binding before it is defined, just as you can in a set of let bindings.

type RuleName = String Source

Type synonym for the name of a production rule. This will be the name of the type constructor for the corresponding type in the AST, so this must be a valid Haskell type constructor name.

If you are creating a terminal, option, list, list1, or wrap, the RuleName will also be used for the name of the single data construtor. If you are creating a nonTerminal, you will specify the name of each data constructor with AlternativeName.

type AlternativeName = String Source

Type synonym the the name of an alternative in a nonTerminal. This name must not conflict with any other data constructor, either one specified as an AlternativeName or one that was created using terminal, option, list, or list1.

data Rule t Source

A single production rule. It may be a terminal or a non-terminal.

Instances

Eq t => Eq (Rule t) Source 
Ord t => Ord (Rule t) Source 
Show t => Show (Rule t) Source 

terminal Source

Arguments

:: RuleName 
-> Intervals t

Valid terminal symbols

-> Pinchot t (Rule t) 

Creates a terminal production rule.

terminalSeq Source

Arguments

:: RuleName 
-> [t]

Sequence of terminal symbols to recognize

-> Pinchot t (Rule t) 

Creates a production for a sequence of terminals. Useful for parsing specific words.

nonTerminal Source

Arguments

:: RuleName 
-> [(AlternativeName, [Rule t])]

Alternatives. There must be at least one alternative; otherwise, an error will result. In each pair (a, b), a will be the data constructor, so this must be a valid Haskell data constructor name. b is the sequence of production rules, which can be empty (this is how to create an epsilon production).

-> Pinchot t (Rule t) 

Creates a new non-terminal production rule.

Rules that modify other rules

list Source

Arguments

:: RuleName 
-> Rule t

The resulting Rule is a sequence of productions of this Rule; that is, this Rule may appear zero or more times.

-> Pinchot t (Rule t) 

Creates a rule for the production of a sequence of other rules.

list1 Source

Arguments

:: RuleName 
-> Rule t

The resulting Rule produces this Rule at least once.

-> Pinchot t (Rule t) 

Creates a rule for a production that appears at least once.

option Source

Arguments

:: RuleName 
-> Rule t

The resulting Rule optionally produces this Rule; that is, this Rule may appear once or not at all.

-> Pinchot t (Rule t) 

Creates a rule for a production that optionally produces another rule.

wrap Source

Arguments

:: RuleName 
-> Rule t

The resulting Rule simply wraps this Rule.

-> Pinchot t (Rule t) 

Creates a newtype wrapper.

label :: String -> Rule t -> Rule t Source

Name a Rule for use in error messages. If you do not name a rule using this combinator, the rule's type name will be used in error messages.

(<?>) :: Pinchot t (Rule t) -> String -> Pinchot t (Rule t) infixr 0 Source

Infix form of label for use in a Pinchot; handy for use in do or mdo notation.

Transforming an AST to code

earleyGrammar Source

Arguments

:: Lift t 
=> String

Module prefix. You have to make sure that the data types you created with ruleTreeToCode or with allRulesToCode are in scope, either because they were spliced into the same module that earleyParser is spliced into, or because they are imported into scope. The spliced Template Haskell code has to know where to look for these data types. If you did an unqualified import or if the types are in the same module as is the splice of earleyParser, just pass the empty string here. If you did a qualified import, pass the appropriate namespace here.

For example, if you used import qualified MyAst, pass "MyAst" here. If you used import qualified Data.MyLibrary.MyAst as MyLibrary.MyAst, pass "MyLibrary.MyAst" here.

For an example where the types are in the same module, see Pinchot.Examples.PostalAstRuleTree or Pinchot.Examples.PostalAstAllRules.

For an example using a qualified import, see Pinchot.Examples.QualifiedImport.

-> Pinchot t (Rule t)

Creates an Earley parser for the Rule that the Pinchot returns.

-> Q Exp 

Creates an Earley grammar for a given Rule. For examples of how to use this, see the source code for Pinchot.Examples.PostalAstRuleTree and for Pinchot.Examples.PostalAstAllRules.

allRulesToCode Source

Arguments

:: Name

Terminal type constructor name. Typically you will use the Template Haskell quoting mechanism to get this.

-> [Name]

What to derive. For instance, you might use Eq, Ord, and Show here. Each created data type will derive these instances.

-> Pinchot t a

The return value from the Pinchot is ignored.

-> DecsQ 

Creates code for every Rule created in the Pinchot. The data types are created in the same order in which they were created in the Pinchot. When spliced, the DecsQ is a list of declarations, each of which is an appropriate data or newtype. For an example use of allRulesToCode, see Pinchot.Examples.PostalAstAllRules.

ruleTreeToCode Source

Arguments

:: Name

Terminal type constructor name. Typically you will use the Template Haskell quoting mechanism to get this.

-> [Name]

What to derive. For instance, you might use Eq, Ord, and Show here. Each created data type will derive these instances.

-> Pinchot t (Rule t)

A data type is created for the Rule that the Pinchot returns, and for the ancestors of the Rule.

-> DecsQ 

Creates code only for the Rule returned from the Pinchot, and for its ancestors.