% Canonical LF using unbound-generics % Aleksey Kliger % June 2016 Canonical LF ============ This is a representation of [LF](http://www.twelf.org/) in which all terms are automatically in *canonical form*. The key idea is to segregate the type families and the terms into *atomic* and *normal* forms where the term variables only stand for atomic terms, and not arbitrary ones. Then, a substitution procedure is defined that takes terms in normal form and performs a substitution for a variable while simultaneously normalizing any redices that occur. > {-# LANGUAGE DeriveGeneric, StandaloneDeriving, DeriveDataTypeable, > ViewPatterns, RankNTypes, FlexibleContexts, FlexibleInstances, > FunctionalDependencies, TypeFamilies > #-} > module CanonicalLF where > import Unbound.Generics.LocallyNameless > import GHC.Generics (Generic) > import Data.Typeable (Typeable) > import qualified Data.Map as M > import Control.Monad.Reader > import Control.Monad.Except > import Data.Functor.Identity > import Control.Applicative (Const(..)) Syntax ------ An LF signature introduces type family atoms and constant terms. > data Signature = NilS > | SnocAtom (Rebind Signature (Atm, Embed Kind)) > | SnocConst (Rebind Signature (Cnst, Embed Type)) > deriving (Show, Generic, Typeable) The type families are classified by kinds and may either be plain types, or pi-kinds for families of types indexed by a term variable. > data Kind = TypeK | PiK (Bind (Var, Embed Type) Kind) > deriving (Show, Generic, Typeable) The atomic type families are either type familiy atoms applied to zero or more terms in normal form. > type Atm = Name P > data P = AtmP Atm | AppP P Term > deriving (Show, Generic, Typeable) Type families in normal form are either atomic type families or dependent product types indexed by a term variable of normal type. > data Type = PT P | PiT (Bind (Var, Embed Type) Type) > deriving (Show, Generic, Typeable) The atomic terms are either variables or constants applied to zero or more terms in normal form. > type Cnst = Name R > type Var = Name R > data R = VarR Var | ConstR Cnst | AppR R Term > deriving (Show, Generic, Typeable) A term in normal form is either an atomic term or a lambda abstraction that binds a term variable. > data Term = RM R | LamM (Bind Var Term) > deriving (Show, Generic, Typeable) When typechecking kinds, types or terms, new term variables may come into scope. They are collected in contexts. > data Context = NilC > | Snoc (Rebind Context (Var, Embed Type)) > deriving (Show, Generic, Typeable) All the syntactic objects are equivalent upto renaming of bound variables. > instance Alpha Signature > instance Alpha Kind > instance Alpha P > instance Alpha Type > instance Alpha R > instance Alpha Term > instance Alpha Context The metatheory of Canonical LF uses simple types to prove the termination of hereditary substitution (defined below). But they aren't needed in the implementation. (Although it would be interesting to lift them to Haskell kinds and index the terms by the simple types to disallow some malformed terms.) > data SimpleType = AtmS Atm | ArrS SimpleType SimpleType > deriving (Show, Generic, Typeable) > > instance Alpha SimpleType Hereditary Substitution ----------------------- Variables in Canonical LF stand for atomic terms, but we will need to subtitute terms for them. If we used ordinary capture-avoiding substitution, such substitution would produce redices, which we are precisely what we don't want. However redices will potentially only occur when the variable for which we're substituting occurs at the head of an atomic term. > isHeadVarR :: Var -> R -> Bool > isHeadVarR x (VarR y) = x == y > isHeadVarR _ (ConstR _) = False > isHeadVarR x (AppR r _) = isHeadVarR x r Just using a boolean to decide if the variable is at the head of an atomic term is fine, but we can actually partition an atomic term into its head variable or constant together with a spine of applications. > data Spine a = NilSp a | AppSp (Spine a) Term > data Head = VarH Var | ConstH Cnst If the variable for which we'll be substituting is at the head we only really care about the spine. Otherwise we have some other variable, or perhaps a constant at the head. > headSpine :: Var -> R -> Either (Spine Head) (Spine ()) > headSpine x (VarR y) | x == y = Right (NilSp ()) > | otherwise = Left (NilSp (VarH y)) > headSpine _ (ConstR c) = Left (NilSp (ConstH c)) > headSpine x (AppR r m) = case headSpine x r of > Left s -> Left (AppSp s m) > Right s -> Right (AppSp s m) Substitution in a kind just carries out substitution in the types. Likewise substutition in normal type families. > substKind :: Fresh m => Term -> Var -> Kind -> m Kind > substKind _ _ TypeK = return TypeK > substKind m x (PiK bnd) = do > ((y, unembed -> a), k) <- unbind bnd > a' <- substType m x a > k' <- substKind m x k > return $ PiK $ bind (y, embed a') k' > > substType :: Fresh m => Term -> Var -> Type -> m Type > substType m x (PT p) = do > p' <- substP m x p > return (PT p') > substType m x (PiT bnd) = do > ((y, unembed -> a), b) <- unbind bnd > a' <- substType m x a > b' <- substType m x b > return $ PiT $ bind (y, embed a') b' Atomic type family application substitutes a term for a variable in the (normal) index terms. > substP :: Fresh m => Term -> Var -> P -> m P > substP _ _ (AtmP a) = return (AtmP a) > substP m x (AppP p n) = do > p' <- substP m x p > n' <- substTerm m x n > return (AppP p' n') Normal term substitution goes under a lambda (freshness is ensured by the library) and into the atomic term. > substTerm :: Fresh m => Term -> Var -> Term -> m Term > substTerm m x (RM r) = substR m x r > substTerm m x (LamM bnd) = do > (y, n) <- unbind bnd > n' <- substTerm m x n > return $ LamM $ bind y n' To substitute in an atomic term we separate the head and the spine and proceed according to whether the variable at the head is the one we are substituting for. > substR :: Fresh m => Term -> Var -> R -> m Term > substR m x r = > case headSpine x r of > Left rsp -> RM <$> substRRsp m x rsp > Right msp -> substMsp m x msp If there is another variable or a constant at the head, we will get some kind of atomic term out since the head is unchanged and we only substitute into the index terms. > substRRsp :: Fresh m => Term -> Var -> Spine Head -> m R > substRRsp _ _ (NilSp h) = return (headR h) > substRRsp m x (AppSp sp n) = do > n' <- substTerm m x n > r <- substRRsp m x sp > return $ AppR r n' > > headR :: Head -> R > headR (VarH y) = VarR y > headR (ConstH c) = ConstR c When the variable we care about is at the head, we apply the substitution to the rest of the spine to get a normal term (which, by the metatheory will turn out to be some kind of a lambda), apply the substitution to the index, and then carry out a new substitution of the new index for the variable bound by the lambda to the body of the lambda. This last step is the heredetary part of heredetary substitution. The metatheory guarantees that this process will terminate. (Because the simple type of the body of the lambda is smaller than the simple type of the original term). > substMsp :: Fresh m => Term -> Var -> Spine () -> m Term > substMsp m _ (NilSp ()) = return m > substMsp m x (AppSp s n) = do > o_ <- substMsp m x s > n' <- substTerm m x n > case o_ of > LamM bnd -> do > (y, o) <- unbind bnd > substTerm n' y o > _ -> error "can't happen" Typechecking ============ We typecheck in an environment that maps type family atoms, term costants and variables to their respective kinds and types. > data Env = Env { _envAtm :: M.Map Atm Kind, > _envConst :: M.Map Cnst Type, > _envCtx :: M.Map Var Type } > emptyEnv :: Env > emptyEnv = Env M.empty M.empty M.empty Some lenses to work with the environment > envAtm :: Lens' Env (M.Map Atm Kind) > envAtm afb s = fmap (\atms -> s { _envAtm = atms }) $ afb (_envAtm s) > > envCtx :: Lens' Env (M.Map Var Type) > envCtx afb s = fmap (\ctx -> s { _envCtx = ctx } ) $ afb (_envCtx s) > > envConst :: Lens' Env (M.Map Cnst Type) > envConst afb s = fmap (\sig -> s { _envConst = sig} ) $ afb (_envConst s) And some combinators to perform lookups. > lookupOver :: (MonadReader e m, MonadError String m) => Getting p e p -> String -> (p -> Maybe c) -> m c > lookupOver l s f = do > mk <- views l f > case mk of > Nothing -> throwError $ "unbound " ++ s > Just c -> return c > > lookupAtom :: (MonadReader Env m, MonadError String m) => Atm -> m Kind > lookupAtom = lookupOver envAtm "atom" . M.lookup > > lookupVar :: (MonadReader Env m, MonadError String m) => Var -> m Type > lookupVar = lookupOver envCtx "variable" . M.lookup > > lookupConst :: (MonadReader Env m, MonadError String m) => Cnst -> m Type > lookupConst = lookupOver envConst "constant" . M.lookup To check a signature we check the kind or type classifying the atom or constant and then continue in an environment extended with the new binding. > withSigOk :: (Fresh m, MonadReader Env m, MonadError String m) => Signature -> m r -> m r > withSigOk NilS kont = kont > withSigOk (SnocAtom (unrebind -> (s, (a, unembed -> k)))) kont = > withSigOk s $ do > local (set envCtx M.empty) $ wfk k > local (over envAtm (M.insert a k)) kont > withSigOk (SnocConst (unrebind -> (s, (c, unembed -> a)))) kont = > withSigOk s $ do > local (set envCtx M.empty) $ wfType a > local (over envConst (M.insert c a)) kont Kind and normal type formation is unsurprising. Note that PT is only well-formed when the atomic type family is fully applied and is of base kind. > wfk :: (Fresh m, MonadReader Env m, MonadError String m) => Kind -> m () > wfk TypeK = return () > wfk (PiK bnd) = do > ((x, unembed -> t), k) <- unbind bnd > wfType t > local (over envCtx (M.insert x t)) $ wfk k > > wfType :: (Fresh m, MonadReader Env m, MonadError String m) => Type -> m () > wfType (PT p) = do > k <- inferP p > case k of > TypeK -> return () > _ -> throwError "expected a type" > wfType (PiT bnd) = do > ((x, unembed -> a), b) <- unbind bnd > wfType a > local (over envCtx (M.insert x a)) $ wfType b For atomic type families we infer their kinds. For atoms we read the kind off from the environment. For applications we infer the kind of the type family (which had better be a pi kind), and then check that the term argument has the expected type and then return the resulting kind where we (heredeterily) substitute the term for the index variable. > inferP :: (Fresh m, MonadReader Env m, MonadError String m) => P -> m Kind > inferP (AtmP atm) = lookupAtom atm > inferP (AppP p m) = do > k <- inferP p > case k of > TypeK -> throwError "expected a pi kind" > PiK bnd -> do > ((x, unembed -> a), k') <- unbind bnd > checkTerm m a > substKind m x k' To check that a term in normal form has the expected normal type, we check that its either a lambda of pi type, or an atomic term of atomic type. The latter ensures that terms are in eta long form by requiring all variables and constants to be fully applied. We infer the type of an atomic term (which had better be atomic) and then check that it is alpha-equivalent to the given atomic type. Because the calculus is constructed to only allow terms in normal form, alpha equivalence suffices and we don't have to do any normalization. (We paid that price in heredetary substitution.) > checkTerm :: (Fresh m, MonadReader Env m, MonadError String m) => Term -> Type -> m () > checkTerm (LamM bnd) (PiT bnd') = do > mmatch <- unbind2 bnd bnd' > case mmatch of > Just (x, m, (_, unembed -> a), b) -> do > wfType a > local (over envCtx (M.insert x a)) $ checkTerm m b > Nothing -> throwError "did not match" > checkTerm (RM r) (PT p) = do > t <- inferTerm r > case t of > (PT p') | p `aeq` p' -> return () > _ -> throwError "atomic term doesn't have the expected atomic type." > checkTerm (LamM {}) _ = throwError "lambda with no-PI type" > checkTerm (RM {}) _ = throwError "atomic term with non-atomic type" To infer the type of a term, we lookup variables and constants in the environment. For applications we ensure that the head has some kind of pi type and then check the index against the argument type, and then return the result type with the argument substituted for the index variable. > inferTerm :: (Fresh m, MonadReader Env m, MonadError String m) => R -> m Type > inferTerm (VarR x) = lookupVar x > inferTerm (ConstR c) = lookupConst c > inferTerm (AppR r m) = do > p_ <- inferTerm r > case p_ of > PiT bnd -> do > ((x, unembed -> a), b) <- unbind bnd > checkTerm m a > substType m x b > PT {} -> throwError "expected a function in application position" Smart Constructors ================== This section defines a little DSL for writing Canonical LF terms in Haskell. It's a higher-order encoding that uses haskell variable binding to represent LF binding constructs. We use a higher-kinded repr parameter to allow for different sorts of interpretations for this DSL. Although in this example we only build one using the Syn newtype to wrap a fresh-name monad computation that just builds a term in our original AST, above. > class TermSyntax repr r n | r -> n, n -> r where > lam :: String -> ((repr r) -> (repr n)) -> (repr n) > app :: repr r -> repr n -> repr r > rm :: repr r -> repr n > > newtype Syn m a = Syn { unSyn :: m a } > > instance Fresh m => TermSyntax (Syn m) R Term where > lam hint f = Syn $ do > x <- fresh (s2n hint) > m <- unSyn $ f (Syn $ return $ VarR x) > return $ LamM (bind x m) > app r n = Syn (AppR <$> unSyn r <*> unSyn n) > rm r = Syn (RM <$> unSyn r) > class TypeSyntax repr p a | a -> p, p -> a where > type TermInType repr a :: * > type RInType repr a :: * > piT :: String -> repr a -> (repr (RInType repr a) -> repr a) -> repr a > arrT :: repr a -> repr a -> repr a > arrT a b = piT "_" a (const b) > appP :: repr p -> repr (TermInType repr a) -> repr p > pt :: repr p -> repr a > instance Fresh m => TypeSyntax (Syn m) P Type where > type TermInType (Syn m) Type = Term > type RInType (Syn m) Type = R > piT hint sa f = Syn $ do > x <- fresh (s2n hint) > b <- unSyn $ f (Syn $ return $ VarR x) > a <- unSyn sa > return $ PiT $ bind (x, embed a) b > appP p m = Syn (AppP <$> unSyn p <*> unSyn m) > pt p = Syn (PT <$> unSyn p) > class KindSyntax repr k where > type TypeInKind repr k :: * > type RInKind repr k :: * > typeK :: repr k > piK :: String -> repr (TypeInKind repr k) -> (repr (RInKind repr k) -> repr k) -> repr k > arrK :: repr (TypeInKind repr k) -> repr k -> repr k > arrK a k = piK "_" a (const k) > instance Fresh m => KindSyntax (Syn m) Kind where > type TypeInKind (Syn m) Kind = Type > type RInKind (Syn m) Kind = R > typeK = Syn $ return TypeK > piK hint sa sk = Syn $ do > x <- fresh (s2n hint) > a <- unSyn sa > k <- unSyn $ sk (Syn $ return $ VarR x) > return $ PiK $ bind (x, embed a) k > class SignatureSyntax repr sig p r | sig -> p r where > type KindInSig repr sig :: * > type TypeInSig repr sig :: * > letAtom :: String -> repr (KindInSig repr sig) -> (repr p -> repr sig) -> repr sig > letConstant :: String -> repr (TypeInSig repr sig) -> (repr r -> repr sig) -> repr sig > endSig :: repr sig > > instance Fresh m => SignatureSyntax (Syn m) (Signature -> Signature) P R where > type KindInSig (Syn m) (Signature -> Signature) = Kind > type TypeInSig (Syn m) (Signature -> Signature) = Type > > letAtom hint sk kont = Syn $ do > a <- fresh (s2n hint) > k <- unSyn sk > f <- unSyn $ kont $ Syn $ return $ AtmP a > return $ \sig -> f (SnocAtom $ rebind sig (a, embed k)) > > letConstant hinst st kont = Syn $ do > c <- fresh (s2n hinst) > t <- unSyn st > f <- unSyn $ kont $ Syn $ return $ ConstR c > return $ \sig -> f (SnocConst $ rebind sig (c, embed t)) > > endSig = Syn $ return id > infixr 6 `arrT`, `arrK` > infixl 6 `appP`, `app` Example ------- An LF signature fragment for first-order logic. > example1 :: Fresh m => Syn m (Signature -> Signature) > example1 = > letAtom "o" typeK $ \o -> > letConstant "tt" (pt o) $ \tt -> > letConstant "ff" (pt o) $ \ff -> > letConstant "not" (pt o `arrT` pt o) $ \not -> > letConstant "and" (pt o `arrT` pt o `arrT` pt o) $ \and -> > letAtom "nd" (pt o `arrK` typeK) $ \nd -> > letConstant "tti" (pt (nd `appP` rm tt)) $ \tti -> > letConstant "ffe" (piT "a" (pt o) $ \a -> > pt (nd `appP` rm ff) > `arrT` > pt (nd `appP` rm a)) $ \ffe -> > letConstant "noti" (piT "a" (pt o) $ \a -> > (piT "p" (pt o) $ \p -> > pt (nd `appP` rm a) > `arrT` > pt (nd `appP` rm p)) > `arrT` > pt (nd `appP` rm (not `app` rm a))) $ \noti -> > letConstant "note" (piT "a" (pt o) $ \a -> piT "c" (pt o) $ \c -> > pt (nd `appP` rm (not `app` rm a)) > `arrT` > pt (nd `appP` rm a) > `arrT` > pt (nd `appP` rm c)) $ \note -> > letConstant "andi" (piT "a" (pt o) $ \a -> piT "b" (pt o) $ \b -> > pt (nd `appP` rm a) > `arrT` > pt (nd `appP` rm b) > `arrT` > pt (nd `appP` rm (and `app` rm a `app` rm b))) $ \andi -> > letConstant "ande1" (piT "a" (pt o) $ \a -> piT "b" (pt o) $ \b -> > pt (nd `appP` rm (and `app` rm a `app` rm b)) > `arrT` > pt (nd `appP` rm a)) $ \ande1 -> > letConstant "ande2" (piT "a" (pt o) $ \a -> piT "b" (pt o) $ \b -> > pt (nd `appP` rm (and `app` rm a `app` rm b)) > `arrT` > pt (nd `appP` rm b)) $ \ande2 -> > endSig > checkSynSig :: (Fresh m, MonadReader Env m, MonadError String m) => Syn m (Signature -> Signature) -> m () > checkSynSig sig = do > sigf <- unSyn sig > withSigOk (sigf NilS) $ return () Example: ```haskell >>> runExceptT $ runFreshMT $ runReaderT (checkSynSig example1) emptyEnv Right () ``` Appendix: Lens utilities ============== Some machinery to work with records. > type Lens s t a b = forall f . Functor f => (a -> f b) -> s -> f t > type Lens' s a = Lens s s a a > type Setting s t a b = (a -> Identity b) -> s -> Identity t > type Setting' s a = Setting s s a a > over :: Setting s t a b -> (a -> b) -> s -> t > over l f = runIdentity . l (Identity . f) > set :: Setting s t a b -> b -> s -> t > set l = over l . const > type Getting r s a = (a -> Const r a) -> s -> Const r s > views :: MonadReader s m => Getting a s a -> (a -> r) -> m r > views l f = asks (\s -> f (getConst (l Const s)))