nanopass-0.0.2.0: An EDSL for creating compilers using small passes and many intermediate representations.
Safe HaskellSafe-Inferred
LanguageHaskell2010

Language.Nanopass

Description

Nanopass consists essentially of creating languages and defining passes. Languages can be created from scratch or by derivation using deflang. The tedious parts of a compiler pass (or at least, most passes) can be generated with defpass.

More details and examples are given in the readme.

Synopsis

Documentation

deflang :: QuasiQuoter Source #

Define a language, either from scratch or by derivation from an existing language. The syntax is based on s-expressions. Whitespace doesn't matter, and a (full) line can be commented out with a hash (#). More details and examples are given in the readme.

We embed the syntax of the quasiquoters in a modified form of sexprs which allow---and distinguish between---square and curly brackets alongside round brackets. Atoms are just sequences of characters that don't contain whitespace, though we only recognize a handful of these as valid syntactically. Importantly, we treat symbols differently based on their shape:

  • UpCamelCase is used as in normal Haskell: to identify constructors, both type- and data-
  • $Name is used for recursive references to syntactic categories
  • lowerCamel is used for language parameters and the names of terms
  • DotSeparated.UpCamelCase is used to qualify the names of languages and types.
  • a handful of operators are used

Since the syntax is based on s-expressions, we use Scheme's entry format conventions for describing the syntax. Importantly, we syntactic variables are enclosed in ⟨angle brackets⟩, and ellipsis ⟨thing⟩… indicate zero or more repetitions of ⟨thing⟩. Round, square, and curly brackets, as well as question mark, asterisk, and so on have no special meaning: they only denote themselves.

 langdef
   ::= ⟨language definition⟩
    |  ⟨language modification⟩
 
 language definition
   ::= ⟨UpName⟩ ( ⟨lowName⟩… ) ⟨syntactic category⟩…
   ::= ⟨UpName⟩ ⟨syntactic category⟩…
 
 language modification
   ::= ⟨Up.Name⟩ :-> ⟨UpName⟩ ( ⟨lowName⟩… ) ⟨syntactic category modifier⟩…
    |  ⟨Up.Name⟩ :-> ⟨UpName⟩ ⟨syntactic category modifier⟩…
 
 syntactic category ::= ( ⟨UpName⟩ ⟨production⟩… )
 production ::= ( ⟨UpName⟩ ⟨subterm⟩… )
 subterm
   ::= { ⟨lowName⟩ ⟨type⟩ }
    |  ⟨type⟩
 
 type
   ::= $⟨UpName⟩                               # reference a syntactic category
    |  ⟨lowName⟩                               # type parameter
    |  ( ⟨Up.Name⟩ ⟨type⟩… )                   # apply a Haskell Type constructor to arguments
    |  ⟨Up.Name⟩                               # same as: (⟨Up.Name⟩)
    |  ( ⟨type⟩ ⟨type operator⟩… )             # apply common type operators (left-associative)
    |  ( ⟨Up.Name⟩ ⟨type⟩… ⟨type operator⟩… )  # same as: ((⟨UpName⟩ ⟨type⟩…) ⟨type operator⟩…)
    |  { ⟨type⟩ ⟨type⟩ ⟨type⟩… }               # tuple type
    |  [ ⟨type⟩ :-> ⟨type⟩ ]                   # association list: ({⟨type⟩ ⟨type⟩} *)
    |  { ⟨type⟩ :-> ⟨type⟩ }                   # Data.Map
 
 type operator
   ::= *  # []
    |  +  # NonEmpty
    |  ?  # Maybe

defpass :: QuasiQuoter Source #

Define automatic translation between two langauges. This creates an Xlate type and the descend<Syntactic Category> family of functions, as well as pure variants (XlateI and descend<Syntactic Category>I) and a lifting function idXlate. A translation function is generated for each syntactic category with the same name in both source and target languages. At the moment, there is no provision for altering the name of the type or translation function(s), but I expect you'll only want to define one translation per module.

The Xlate type takes all the parameters from both languages (de-duplicating parameters of the same name), as well as an additional type parameter, which is the functor f under which the translation occurs.

The type of a descend<Syntactic Category> function is Xlate f → σ → f σ'.

If a production in the source language has subterms τ₁ … τₙ and is part of the syntactic category σ, then a hole member is a function of type τ₁ → … τₙ → f σ', where σ' is the corresponding syntactic category in the target language. Essentially, you get access all the subterms, and can use the Applicative to generate a target term as long as you don't cross syntactic categories.

If a source language has syntactic category σ with the same name as the target's syntactic category σ', then an override member is a function of type σ → Maybe (f σ'). If an override returns Nothing, then the automatic translation will be used, otherwise the automatic translation is ignored in favor of the result under the Just.

The pure variants have the same form as the Applicative ones, but:

  • XlateI is not parameterized by f, nor are the types of its members,
  • the members of XlateI are suffixed with the letter I, and
  • the types of the descend<Syntactic Category>I functions are not parameterzed by f.

The idXlate function is used by Nanopass to translate XlateI values into Xlate values. This is done so that the same code paths can be used for both pure and Applicative translations. Under the hood, this is done with appropriate wrapping/unwrapping of Identity, which is a no-op.

None of the functions defined by this quasiquoter need to be expoted for Nanopass to function. I expect you will not export any of these definitions directly, but instead wrap them into a complete pass, and only export that pass.

More details and examples are given in the readme.

The syntax is:

 ⟨Up.Name⟩ :-> ⟨Up.Name⟩