hexpr- A framework for symbolic, homoiconic languages.

Safe HaskellSafe-Inferred




Hexprs are a data structure for representing and storing first-class code objects in a strongly-, statically-typed language.

In an untyped language, first-class code may be represented by simple lists, which may then hold both atomic values as well as other lists. As code objects will need to be manipulated regularly, untyped lists (or [∀a.a], if you prefer) are insufficient in a statically typed language. We therefore introduce Hexprs, which is the type of the extended context-free grammar family S ::= atom | (SS+).

By contrast, the grammar of s-exprs is S ::= atom | (SS*), which also has an associated type. However, s-exprs are less suitable for mathematical reasoning. Note that s-exprs distinguish between a and (a), which is entirely contrary to mathematical notation, wherein superfluous parenthesis may be dropped, and often should be for social reasons. While I'm at it, I may as well say the grammar of roses is S ::= atom | atom(S*).

Because languages with first-class code benefit greatly from quasiquotation, we also introduce the Quasihexpr, which is isomorphic to Hexpr. We give algorithms for encoding quasihexprs into hexprs, see UnQuasihexpr and SimplifyHexpr for more detail. The other direction is considered only useful in theoretical work.

Because we are programming in Haskell, and not Idris, I have decided to leave some invariants out of the type system. The documentation of Hexpr and Quasihexpr give these invariants. It would make pattern matching too cumbersome to encode these invariants, and some would even need extensions. If I were to instead hide the unsafety, it would make pattern matching impossible.


Primary Data Structures

data Hexpr p a Source

Whereas a rose contains an element at every level of organization, elements of a hexpr appear only in the leaves of the structure. That is, internal nodes (branches) are only responsible for grouping consecutive elements and other groups.

Hexprs further disallow trivial branches, where trivial means containing zero of one children. Where there are zero children in a branch, the branch contains no information. Where a branch contains only one node, no extra grouping information is provided. As branches are responsible for grouping, and grouping alone, it does not make sense to allow branches that contain no grouping structure. These restrictions on the number of children in a branch are not currently enforced by the type system, so several functions on hexprs are properly partial.

To aid in production-quality language implementation, we also attach a position to each node. If position is unneeded, simply specialize to ().


Leaf p a 
Branch p [Hexpr p a] 


Hierarchy Hexpr p 
Functor (Hexpr p) 
Openable (Hexpr p) 
Eq a => Eq (Hexpr p a) 

data Quasihexpr p a Source

A Quasihexpr extends Hexpr with quasiquotation.

In addition to the usual restrictions on hexprs, each Unquote and Splice element must be contained within a matching Quasiquote ancestor. Each Quasiquote can match with multiple (or zero) Unquote and Splice nodes, just so long as there is no other Quasiquote between. Again, this restriction is not enforced by the type system.


QLeaf p a 
QBranch p [Quasihexpr p a] 
Quote p (Quasihexpr p a) 
Quasiquote p (Quasihexpr p a) 
Unquote p (Quasihexpr p a) 
Splice p (Quasihexpr p a) 


unQuasihexpr :: UnQuasihexpr a => Quasihexpr p a -> Hexpr p aSource

FIXME stale documentation

Transform a quasihexpr into a hexpr. When the input consists only of QLeaf and QBranch nodes, the transformation is trivial. However, Quote, Quasiquote, Unquote and Splice need to be specially encoded.

Of course, appropriate recursion is also needed, but for that, see the source code. It's interesting, but not helpful for understanding the results if you already understand quasiquotation.

The naive algorithm would usually produce hexprs that are more complex than is necessary. This function factors quotation and quote manipulation to eliminate redundancy.

Assuming that fromQuote does not fail in the transformations, the particular transforms made are as follows. Appropriate recursive searches are made so that no opportunity to simplify is lost.

  • (mkNode c1 ... cn) ---> (mkQuote (s1 ... sn)) where si = fromQuote ci
  • (mkConcat c1 ... cn) ---> (mkQuote cs) where cs = conjoins (fromQuote <$> [c1, ..., cn])

TODO: The following are unimplemented, but shouldn't matter too much. However, ideally the set of Hexprs returned from this function should be a proper subset of Hexprs (i.e. a normal form) that is isomorphic to Quasihexpr. Probably, once the isomorphism is proven, I'll merge this in with unQuasihexpr

  • (nodeForm x1 ... xm) (quoteForm xn) ---> (nodeForm x1 ... xm xn)
  • (quoteForm x0) (nodeForm x1 ... xn) ---> (nodeForm x0 x1 ... xn)
  • Any immediate siblings of nodeForm-lead branches are pushed into the nodeForm just so long as the parent is not a concatForm-lead branch.

class UnQuasihexpr a whereSource


mkNode :: p -> [Quasihexpr p a] -> Quasihexpr p aSource

A node that, when evaluated, creates a single hexpr node from at least one code values.

For example, create the node (nodeForm <e_1> ... <e_n>) with n >= 1, such that then if each e_i reduces to a code value v_i, then the whole node evaluates to (quoteForm (<v_1> ... <v_n>)).

mkList :: p -> [Quasihexpr p a] -> Quasihexpr p aSource

A vararg function that turns a number of values into a list during evaluation.

mkConcat :: p -> [Quasihexpr p a] -> Quasihexpr p aSource

A vararg function that concatenates lists of values into a single node where the lists are obtained by evaluating sibling nodes.

For example, create the node (concatForm <e_1> ... <e_n>) with n >= 1, such that if each e_i reduces to a list of code values vs_i, then the whole form evaluates to (quoteForm (<vs_1> ++ ... ++ <vs_n>)).

isList :: Quasihexpr p a -> BoolSource

fromList :: Quasihexpr p a -> [Quasihexpr p a]Source

removeQuotation :: Quasihexpr p a -> Hexpr p aSource