yoko: generic programming with disbanded constructors

[ bsd3, generics, library, reflection ] [ Propose Tags ]

yoko views a nominal datatype as a band of constructors, each a nominal type in its own right. Such datatypes can be disbanded via the disband function into an anonymous sum of nominal constructors, and vice versa via the band function. This library uses extensive type-level programming to enrich its instant-generics foundation with capabilities derived from the constructor-centric perspective.

For example, consider the following nominal datatype.

data Beatles = John ... | Paul ... | George ... | Ringo ...

This type can of course be understood as a sum of the individual constructor types.

data John = John ...
data Paul = Paul ...
data George = George ...
data Ringo = Ringo ...

yoko's conceptual foundations start there. In particular, this allows a constructor, say John, to be used independently of its original range type and sibling constructors.

As a generic programming library, yoko extends intant-generics with support for constructor-centric generic programming. The Examples/LL.hs file distributed with the yoko source demonstrates defining a lambda-lifting conversion between the two types Inner, which has lambdas, and Prog, which has top-level function declarations instead.

data Inner = Lam Type Inner | Var Int | App Inner Inner

data Term = Var Int | App Term Term | DVar Int
data Prog = Prog ([Type], Type, Term) Term

These types are defined in separate modules, since they have constructors with the same name. Indeed, the fact that they having matching constructors named App is crucial for yoko's automatic conversion from Inner's App to Term's App. As written, the generic lambda-lifter would continue to work for any new Inner constructors (e.g. syntax for tuples or mutable references) as long as constructors with the same names and analogous fields were added to Term and the semantics of those constructors doesn't involve binding. This default behavior of the lambda-lifter is specified in about ten lines of user code.

Existing generic libraries don't use constructor names to the degree that yoko does, and so cannot accomodate generic conversions nearly as well.


[Skip to Readme]

Downloads

Maintainer's Corner

Package maintainers

For package maintainers and hackage trustees

Candidates

  • No Candidates
Versions [RSS] 0.1, 0.2, 0.3, 0.3.0.1, 0.3.1, 0.3.1.1, 0.3.1.2, 0.3.1.3, 0.3.2, 0.3.2.1, 0.3.2.2, 0.9, 2.0
Dependencies base (>=4 && <5), tagged (>=0.2 && <0.3), tagged-th (<0.2), type-booleans (<0.2), type-cereal (<0.2), type-digits (<0.2), type-equality (<0.2), type-ord (<0.2), type-ord-spine-cereal (<0.2), type-spine (<0.2) [details]
License BSD-3-Clause
Author Nicolas Frisby <nicolas.frisby@gmail.com>
Maintainer Nicolas Frisby <nicolas.frisby@gmail.com>
Category Generics, Reflection
Uploaded by NicolasFrisby at 2011-11-27T01:27:40Z
Distributions
Reverse Dependencies 1 direct, 0 indirect [details]
Downloads 8059 total (34 in the last 30 days)
Rating (no votes yet) [estimated by Bayesian average]
Your Rating
  • λ
  • λ
  • λ
Status Docs uploaded by user
Build status unknown [no reports yet]

Readme for yoko-0.1

[back to package description]
drex (read "dee-recks")

While the "d" is just for "datatype", "rex" is bit of a double entendre.

The representation classes (DT, Generic, Gist) each disinvest their parameter
of its nominality, similar to how a king disinvests nobles of their titles.

Each class also dismantles, or "wrecks", its parameter: DT takes a single type
to a sum of its constructors, Generic maps a constructor to its underlying
shape, and Gist forgets the shape and mediation.

... Also, it sounds like "T-rex".

===================

See http://j.mp/tNLx40 for a Google spreadsheet cataloging the various d-rex
concepts and components.

====================

#1 Basic Universes

The @:::@ class in the @Universe@ module is pervasive in d-rex. The constraint
@t ::: u@ is read "t inhabits u" (or "t satisfies u", if you must). @u@ is a
/universe/, a type that represents a possibly finite, possibly paradoxical
collection of types. Universes can be /open/ or /closed/. @Lit@, for example,
is closed.

  in module Ex
    data Lit t where IntLit :: Lit Int; CharLit :: Lit Char

@ShowD@ is open, since new instances of @Show@ can be declared anywhere.

  in module Ex
    data ShowD t where ShowD :: Show t => ShowD t

(The "D" suffix is for "dictionary", since this GADT operationally reifies the
@Show@ dictionary. @(\ShowD x -> show x) :: ShowD t -> t -> String@ -- note
that there's no @Show t@ constraint in that type.

Some closed universes are also finite. There exists an isomorphism between such
a universes and a finite set of types (#4 below).

#2 Constructor Universes

d-rex's principle novelty is its support of the finite closed universe of a
datatype's constructors, codifed as the indexed data family @DCU@. The @open@
method of the @DT@ type class converts from a type to its universe of
constructors. @close@ goes back the other way.

As d-rex breaks a datatype into its universe of constructors, it also generates
a new void type per constructor. For example, d-rex breaks @Either a b@ into

  in module G
    data Left a b; data Right a b

With these types, d-rex declares the constructor universe of @Either a b@.

  in module G
    data instance DCU (Either a b) where
      Left_  :: DCU (Either a b) (Left  a b)
      Right_ :: DCU (Either a b) (Right a b)

Note that each constructor type inherits the original type's parameters. d-rex
also declares an instance of the data family @RM@ for each constructor -- the
resulting types are called /fields types/.

  in module G
    newtype instance RM (N (Left  a b)) m = Left a
    newtype instance RM (N (Right a b)) m = Right b

  in module ReflectBaseR
    type Fields dc = RM (N dc)

(Clearly, d-rex re-uses the constructor names. Hence, the generic declarations
must always generated in a separate module to enable namespace management.)

The data family @RM@, the @m@ parameter, type family @App@, and @N@ are
explained in the next section. In the interim, we'll make do with a couple
brief declarations.

  in module Type
    data IdT; type instance App IdT a = a

There now exists an isomorphism between @Either a b@ and
@forall dc. (DCU (Either a b) dc, Fields dc IdT).

  Left x    =~= (Left_, G.Left x)
  Right x   =~= (Right_, G.Right x)

The @DCU@ tag is a crucial part of this pair -- without it, G.Left and G.Right
would have inequal types!

#3 Recursion-mediated types

The data family @RM@ stands for "recursion-mediated". The idea is that the @m@
parameter is applied to all recursive type occurences in a constructor's
fields.

  in module T
    data Even a = Zero | Even a (Odd a)
    data Odd  a = Odd a (Even a) 

  in module G
    data Zero a; data Even a; data Odd a
    data instance RM (N (Zero   a)) m
    data instance RM (N (G.Even a)) m = Even a (App m (T.Odd  a))
    data instance RM (N (G.Odd  a)) m = Odd  a (App m (T.Even a))

  in module Type
    data True = True; data False = False

  in module Ex
    data ParityM
    type instance App ParityM (T.Even a) = False
    type instance App ParityM (T.Odd  a) = True

    ex0 = Even 'e' True :: Fields (G.Even Char) ParityM
    ex1 = Odd 'o' False :: Fields (G.Odd  Char) ParityM

The recursion-mediated representation of the fields types enables their
re-use. For example, the same fields type can be used to define a bottom-up
reducer, where the recursive occurrences have been replaced with the result of
the catamorphism.

  in module Ex
    data LengthM
    type instance App LengthM (T.Even a) = Int
    type instance App LengthM (T.Odd  a) = Int

    type Reducer m dc = Fields dc m -> App m dc

    ex2 :: Reducer Len (G.Even a)
    ex2 (G.Even _ i) = 1 + i 

Since @App@ is a type family, it's not necessarily injective. @LengthM@
demonstrates where injectivity would not be desirable. Indeed, d-rex relies on
this as discussed in #6 below. Unfortunately, non-injectivity can muddle type
inference. For example, the inferred type of @G.Even 'c' 3@ involves a type
variable: @(Num i, App m (T.Even Char) ~ i) => Fields (G.Even Char) m@. We
provide the function @mediated@ for directly specifying the mediator.

  in module Util
    mediated :: [qP|m|] -> RM c m -> RM c m
    mediated = const id

  in module Ex
    -- inferred ex3 :: Fields (G.Even Char) LengthM
    ex3 = mediated [qP|LengthM|] $ G.Even 'c' 3

(@qP@ is just a quasiquoter for proxies -- useful for passing types as values.)

The data family @RM@ is indexed by the core representational types. Most of
these are common to many representation-based generic programming
libraries. They indeed compromise a closed universe @Core@; that particular
universe per se is not codified in d-rex, but its closedness is the crux of all
representational generic programming.

  in module Core
    type family Rep a
    data V                            -- void
    data U = U                        -- unit
    newtype D a = D a                 -- a dependency
    newtype R t = R t                 -- a recursive occurrence
    newtype F f c = F (f c)           -- argument to a *->*
    newtype FF ff c d = FF (ff c d)   -- arguments to a *->*->*
    newtype M i c = M c               -- meta information

    newtype N t = N t                 -- a named type (user hook)

    type (:+) = FF Either
    type (:*) = FF (,)
    type (:->) = FF (->)

The structure of many constructors' fields, like T.Even can be codified in
terms of these basic types.

  in module G
    type Rep (G.Even a) = FF (,) (D a) (R a)

The recursion-mediated types are indexed by these core types. Note that the
following declarations are in a separate module, so the constructor names don't
actually clash.

  in module GenericR
    data family RM c m
    data instance RM V m
    data instance RM U m = U
    newtype instance RM (D a) m = D a
    newtype instance RM (R t) m = R (App m t)
    newtype instance RM (F f c) m = F (f (RM c m))
    newtype instance RM (FF ff c d) m = FF (ff (RM c m) (RM d m))
    newtype instance RM (M i c) m = M (RM c m)

These just follow the semantics of recusiion-mediated types.

The only core type without an @RM@ instance is @N@. @N@ is crucial to d-rex's
usability. It is the interface boundary between the d-rex kernel and the user
datatype. As demonstrated earlier in this section and in section #2, @RM (N -)@
instances are provided for each fields type. There is also a corresponding
instance of @Rep@ and @Generic@.

  in module GenericR
    class Generic a where
      rep :: RM (N a) m -> RM (Rep a) m
      obj :: RM (Rep a) m -> RM (N a) m

  in module G
    instance Generic (G.Even a) where
      rep ~(G.Even a o) = FF (D a, R o)
      obj ~(FF (D a, R o))) = G.Even a o

INVARIANT: the RHS of a Rep should never include an @N@. @N@ is just in place
to delay the representation of a type.

#4 Sets of types

d-rex also uses a universe of types constructed via V, :+, and N to represent
finite sets (implemented as type-level binary trees) of types. The @Finite@
type class recognizes the isomorphism between a finite closed universes and a
set of types.

  in module TypeBTree
    type family Inhabitants u
    class Finite u where
      path :: u t -> Small (Inhabitants u) t
      tag  :: Small (Inhabitants u) t -> u t

  in module Ex
    type instance Inhabitants (DCU Lit) = N Int :+ N Char

    type instance Inhabitants (DCU (Either a b)) =
      N (G.Left a b) :+ N (G.Right a b)

    type instance Inhabitants (DCU (T.Even a)) =
      N (G.Zero a) :+ N (G.Even a)

@Inhabitants@/@Finite@ recognizes @V@, @:+@, and @N@ as the closed
representational core of finite closed universes in the same way that
@Rep@/@Generic@ encode isomorphisms between the full ensemble of core types and
the large set of Haskell types they can represent.

#5 Other Universes

Exists, Small, All, MFun, MMap ...

#6 Tag-Gist equivalence and Conversions

...