% $Id: Typing.lhs,v 1.7 2004/02/12 19:13:12 wlux Exp $
%
% Copyright (c) 2003-2006, Wolfgang Lux
% See LICENSE for the full license.
%
\nwfilename{Typing.lhs}
\section{Computing the Type of Curry Expressions}
\begin{verbatim}

> module Typing(Typeable(..)) where

> import Data.Maybe
> import Control.Monad

> import Base
> import TypeSubst
> import Combined
> import TopEnv
> import Utils


\end{verbatim}
During the transformation of Curry source code into the intermediate
language, the compiler has to recompute the types of expressions. This
is simpler than type checking because the types of all variables are
known. Yet, the compiler still must handle functions and constructors
with polymorphic types and instantiate their type schemes using fresh
type variables. Since all types computed by \texttt{typeOf} are
monomorphic, we can use type variables with non-negative offsets for
the instantiation of type schemes here without risk of name conflicts.
Using non-negative offsets also makes it easy to distinguish these
fresh variables from free type variables introduce during type
inference, which must be regarded as constants here.

However, using non-negative offsets for fresh type variables gives
rise to two problems when those types are entered back into the type
environment, e.g., while introducing auxiliary variables during
desugaring. The first is that those type variables now appear to be
universally quantified variables, but with indices greater than the
number of quantified type variables.\footnote{To be precise, this can
  happen only for auxiliary variables, which have monomorphic types,
  whereas auxiliary functions will be assigned polymorphic types and
  these type variables will be properly quantified. However, in this
  case the assigned types may be too general.} This results in an
internal error (``Prelude.!!: index too large'') whenever such a type
is instantiated. The second problem is that there may be inadvertent
name captures because \texttt{computeType} always uses indices
starting at 0 for the fresh type variables. In order to avoid these
problems, \texttt{computeType} renames all type variables with
non-negative offsets after the final type has been computed, using
negative indices below the one with the smallest value occurring in
the type environment. Computing the minimum index of all type
variables in the type environment seems prohibitively inefficient.
However, recall that, thanks to laziness, the minimum is computed only
when the final type contains any type variables with non-negative
indices. This happens, for instance, 36 times while compiling the
prelude (for 159 evaluated applications of \texttt{typeOf}) and only
twice while compiling the standard \texttt{IO} module (for 21
applications of \texttt{typeOf}).\footnote{These numbers were obtained
  for version 0.9.9.}

A careful reader will note that inadvertent name captures are still
possible if one computes the types of two or more auxiliary variables
before actually entering their types into the environment. Therefore,
make sure that you enter the types of these auxiliary variables
immediately into the type environment, unless you are sure that those
types cannot contain fresh type variables. One such case are the free
variables of a goal.

\ToDo{In the long run, this module should be made obsolete by adding
attributes to the abstract syntax tree -- e.g., along the lines of
Chap.~6 in~\cite{PeytonJonesLester92:Book} -- and returning an
abstract syntax tree attributed with type information together with
the type environment from type inference. This also would allow
getting rid of the identifiers in the representation of integer
literals, which are used in order to implement overloading of
integer constants.}

\ToDo{When computing the type of an expression with a type signature
make use of the annotation instead of recomputing its type. In order
to do this, we must either ensure that the types are properly
qualified and expanded or we need access to the type constructor
environment.}
\begin{verbatim}

> type TyState a = StateT TypeSubst (St Int) a

> run :: TyState a -> ValueEnv -> a
> run m tyEnv = runSt (callSt m idSubst) 0

> class Typeable a where
>   typeOf :: ValueEnv -> a -> Type

> instance Typeable Ident where
>   typeOf = computeType identType

> instance Typeable ConstrTerm where
>   typeOf = computeType argType

> instance Typeable Expression where
>   typeOf = computeType exprType

> instance Typeable Rhs where
>   typeOf = computeType rhsType

> computeType f tyEnv x = normalize (run doComputeType tyEnv)
>   where doComputeType =
>           do
>             ty <- f tyEnv x
>             theta <- fetchSt
>             return (fixTypeVars tyEnv (subst theta ty))

> fixTypeVars :: ValueEnv -> Type -> Type
> fixTypeVars tyEnv ty = subst (foldr2 bindSubst idSubst tvs tvs') ty
>   where tvs = filter (>= 0) (typeVars ty)
>         tvs' = map TypeVariable [n - 1,n - 2 ..]
>         n = minimum (0 : concatMap typeVars tys)
>         tys = [ty | (_,Value _ (ForAll _ ty)) <- localBindings tyEnv]

> identType :: ValueEnv -> Ident -> TyState Type
> identType tyEnv x = instUniv (varType x tyEnv)

> litType :: ValueEnv -> Literal -> TyState Type
> litType _ (Char _ _)    = return charType
> litType tyEnv (Int v _) = identType tyEnv v
> litType _ (Float _ _)   = return floatType
> litType _ (String _ _)  = return stringType

> argType :: ValueEnv -> ConstrTerm -> TyState Type
> argType tyEnv (LiteralPattern l) = litType tyEnv l
> argType tyEnv (NegativePattern _ l) = litType tyEnv l
> argType tyEnv (VariablePattern v) = identType tyEnv v
> argType tyEnv (ConstructorPattern c ts) =
>   do
>     ty <- instUnivExist (constrType c tyEnv)
>     tys <- mapM (argType tyEnv) ts
>     unifyList (init (flatten ty)) tys
>     return (last (flatten ty))
>   where flatten (TypeArrow ty1 ty2) = ty1 : flatten ty2
>         flatten ty = [ty]
> argType tyEnv (InfixPattern t1 op t2) =
>   argType tyEnv (ConstructorPattern op [t1,t2])
> argType tyEnv (ParenPattern t) = argType tyEnv t
> argType tyEnv (TuplePattern _ ts)
>   | null ts = return unitType
>   | otherwise = liftM tupleType $ mapM (argType tyEnv) ts                -- $
> argType tyEnv (ListPattern _ ts) = freshTypeVar >>= flip elemType ts
>   where elemType ty [] = return (listType ty)
>         elemType ty (t:ts) =
>           argType tyEnv t >>= unify ty >> elemType ty ts
> argType tyEnv (AsPattern v _) = argType tyEnv (VariablePattern v)
> argType tyEnv (LazyPattern _ t) = argType tyEnv t
> argType tyEnv (FunctionPattern f ts) =
>   do 
>     ty <- instUniv (funType f tyEnv)
>     tys <- mapM (argType tyEnv) ts
>     unifyList (init (flatten ty)) tys
>     return (last (flatten ty))
>   where flatten (TypeArrow ty1 ty2) = ty1 : flatten ty2
>         flatten ty = [ty]
> argType tyEnv (InfixFuncPattern t1 op t2) =
>   argType tyEnv (FunctionPattern op [t1,t2])
> argType tyEnv (RecordPattern fs r)
>   | isJust r =
>     do
>       tys <- mapM (fieldPattType tyEnv) fs
>       rty <- argType tyEnv (fromJust r)
>       (TypeVariable i) <- freshTypeVar
>       unify rty (TypeRecord tys (Just i))
>       return rty
>   | otherwise =
>     do
>       tys <- mapM (fieldPattType tyEnv) fs
>       return (TypeRecord tys Nothing)

> fieldPattType :: ValueEnv -> Field ConstrTerm -> TyState (Ident,Type)
> fieldPattType tyEnv (Field _ l t) =
>   do
>     lty <- instUniv (labelType l tyEnv)
>     ty <- argType tyEnv t
>     unify lty ty
>     return (l,lty)

> exprType :: ValueEnv -> Expression -> TyState Type
> exprType tyEnv (Literal l) = litType tyEnv l
> exprType tyEnv (Variable v) = instUniv (funType v tyEnv)
> exprType tyEnv (Constructor c) = instUnivExist (constrType c tyEnv)
> exprType tyEnv (Typed e _) = exprType tyEnv e
> exprType tyEnv (Paren e) = exprType tyEnv e
> exprType tyEnv (Tuple _ es)
>   | null es = return unitType
>   | otherwise = liftM tupleType $ mapM (exprType tyEnv) es
> exprType tyEnv (List _ es) = freshTypeVar >>= flip elemType es
>   where elemType ty [] = return (listType ty)
>         elemType ty (e:es) =
>           exprType tyEnv e >>= unify ty >> elemType ty es
> exprType tyEnv (ListCompr _ e _) = liftM listType $ exprType tyEnv e
> exprType tyEnv (EnumFrom _) = return (listType intType)
> exprType tyEnv (EnumFromThen _ _) = return (listType intType)
> exprType tyEnv (EnumFromTo _ _) = return (listType intType)
> exprType tyEnv (EnumFromThenTo _ _ _) = return (listType intType)
> exprType tyEnv (UnaryMinus _ e) = exprType tyEnv e
> exprType tyEnv (Apply e1 e2) =
>   do
>     (ty1,ty2) <- exprType tyEnv e1 >>= unifyArrow
>     exprType tyEnv e2 >>= unify ty1
>     return ty2
> exprType tyEnv (InfixApply e1 op e2) =
>   do
>     (ty1,ty2,ty3) <- exprType tyEnv (infixOp op) >>= unifyArrow2
>     exprType tyEnv e1 >>= unify ty1
>     exprType tyEnv e2 >>= unify ty2
>     return ty3
> exprType tyEnv (LeftSection e op) =
>   do
>     (ty1,ty2,ty3) <- exprType tyEnv (infixOp op) >>= unifyArrow2
>     exprType tyEnv e >>= unify ty1
>     return (TypeArrow ty2 ty3)
> exprType tyEnv (RightSection op e) =
>   do
>     (ty1,ty2,ty3) <- exprType tyEnv (infixOp op) >>= unifyArrow2
>     exprType tyEnv e >>= unify ty2
>     return (TypeArrow ty1 ty3)
> exprType tyEnv (Lambda _ args e) =
>   do
>     tys <- mapM (argType tyEnv) args
>     ty <- exprType tyEnv e
>     return (foldr TypeArrow ty tys)
> exprType tyEnv (Let _ e) = exprType tyEnv e
> exprType tyEnv (Do _ e) = exprType tyEnv e
> exprType tyEnv (IfThenElse _ e1 e2 e3) =
>   do
>     exprType tyEnv e1 >>= unify boolType
>     ty2 <- exprType tyEnv e2
>     ty3 <- exprType tyEnv e3
>     unify ty2 ty3
>     return ty3
> exprType tyEnv (Case _ _ alts) = freshTypeVar >>= flip altType alts
>   where altType ty [] = return ty
>         altType ty (Alt _ _ rhs:alts) =
>           rhsType tyEnv rhs >>= unify ty >> altType ty alts
> exprType tyEnv (RecordConstr fs) =
>   do 
>     tys <- mapM (fieldExprType tyEnv) fs
>     return (TypeRecord tys Nothing)
> exprType tyEnv (RecordSelection r l) =
>   do 
>     lty <- instUniv (labelType l tyEnv)
>     rty <- exprType tyEnv r
>     (TypeVariable i) <- freshTypeVar
>     unify rty (TypeRecord [(l,lty)] (Just i))
>     return lty
> exprType tyEnv (RecordUpdate fs r) =
>   do
>     tys <- mapM (fieldExprType tyEnv) fs
>     rty <- exprType tyEnv r
>     (TypeVariable i) <- freshTypeVar
>     unify rty (TypeRecord tys (Just i))
>     return rty

> rhsType :: ValueEnv -> Rhs -> TyState Type
> rhsType tyEnv (SimpleRhs _ e _) = exprType tyEnv e
> rhsType tyEnv (GuardedRhs es _) = freshTypeVar >>= flip condExprType es
>   where condExprType ty [] = return ty
>         condExprType ty (CondExpr _ _ e:es) =
>           exprType tyEnv e >>= unify ty >> condExprType ty es

> fieldExprType :: ValueEnv -> Field Expression -> TyState (Ident,Type)
> fieldExprType tyEnv (Field _ l e) =
>   do
>     lty <- instUniv (labelType l tyEnv)
>     ty <- exprType tyEnv e
>     unify lty ty
>     return (l,lty)

\end{verbatim}
In order to avoid name conflicts with non-generalized type variables
in a type we instantiate quantified type variables using non-negative
offsets here.
\begin{verbatim}

> freshTypeVar :: TyState Type
> freshTypeVar = liftM TypeVariable $ liftSt $ updateSt (1 +)

> instType :: Int -> Type -> TyState Type
> instType n ty =
>   do
>     tys <- sequence (replicate n freshTypeVar)
>     return (expandAliasType tys ty)

> instUniv :: TypeScheme -> TyState Type
> instUniv (ForAll n ty) = instType n ty

> instUnivExist :: ExistTypeScheme -> TyState Type
> instUnivExist (ForAllExist n n' ty) = instType (n + n') ty

\end{verbatim}
When unifying two types, the non-generalized variables, i.e.,
variables with negative offsets, must not be substituted. Otherwise,
the unification algorithm is identical to the one used by the type
checker.
\begin{verbatim}

> unify :: Type -> Type -> TyState ()
> unify ty1 ty2 =
>   updateSt_ (\theta -> unifyTypes (subst theta ty1) (subst theta ty2) theta)

> unifyList :: [Type] -> [Type] -> TyState ()
> unifyList tys1 tys2 = sequence_ (zipWith unify tys1 tys2)

> unifyArrow :: Type -> TyState (Type,Type)
> unifyArrow ty =
>   do
>     theta <- fetchSt
>     case subst theta ty of
>       TypeVariable tv
>         | tv >= 0 ->
>             do
>               ty1 <- freshTypeVar
>               ty2 <- freshTypeVar
>               updateSt_ (bindVar tv (TypeArrow ty1 ty2))
>               return (ty1,ty2)
>       TypeArrow ty1 ty2 -> return (ty1,ty2)
>       ty' -> internalError ("unifyArrow (" ++ show ty' ++ ")")

> unifyArrow2 :: Type -> TyState (Type,Type,Type)
> unifyArrow2 ty =
>   do
>     (ty1,ty2) <- unifyArrow ty
>     (ty21,ty22) <- unifyArrow ty2
>     return (ty1,ty21,ty22)

> unifyTypes :: Type -> Type -> TypeSubst -> TypeSubst
> unifyTypes (TypeVariable tv1) (TypeVariable tv2) theta
>   | tv1 == tv2 = theta
> unifyTypes (TypeVariable tv) ty theta
>   | tv >= 0 = bindVar tv ty theta
> unifyTypes ty (TypeVariable tv) theta
>   | tv >= 0 = bindVar tv ty theta
> unifyTypes (TypeConstructor tc1 tys1) (TypeConstructor tc2 tys2) theta
>   | tc1 == tc2 = foldr2 unifyTypes theta tys1 tys2
> unifyTypes (TypeConstrained tys1 tv1) (TypeConstrained tys2 tv2) theta
>   | tv1 == tv2 = theta
> unifyTypes (TypeArrow ty11 ty12) (TypeArrow ty21 ty22) theta =
>   unifyTypes ty11 ty21 (unifyTypes ty12 ty22 theta)
> unifyTypes (TypeSkolem k1) (TypeSkolem k2) theta
>   | k1 == k2 = theta
> unifyTypes (TypeRecord fs1 Nothing) (TypeRecord fs2 Nothing) theta
>   | length fs1 == length fs2 = foldr (unifyTypedLabels fs1) theta fs2
> unifyTypes tr1@(TypeRecord fs1 Nothing) (TypeRecord fs2 (Just a2)) theta =
>   unifyTypes (TypeVariable a2)
>              tr1
>              (foldr (unifyTypedLabels fs1) theta fs2)
> unifyTypes tr1@(TypeRecord _ (Just _)) tr2@(TypeRecord _ Nothing) theta =
>   unifyTypes tr2 tr1 theta
> unifyTypes (TypeRecord fs1 (Just a1)) (TypeRecord fs2 (Just a2)) theta =
>   unifyTypes (TypeVariable a1)
>              (TypeVariable a2)
>              (foldr (unifyTypedLabels fs1) theta fs2)
> unifyTypes ty1 ty2 _ =
>   internalError ("unify: (" ++ show ty1 ++ ") (" ++ show ty2 ++ ")")

> unifyTypedLabels :: [(Ident,Type)] -> (Ident,Type) -> TypeSubst -> TypeSubst
> unifyTypedLabels fs1 (l,ty) theta =
>   maybe theta (\ty1 -> unifyTypes ty1 ty theta) (lookup l fs1)

\end{verbatim}
The functions \texttt{constrType}, \texttt{varType}, and
\texttt{funType} are used for computing the type of constructors,
pattern variables, and variables.

\ToDo{These functions should be shared with the type checker.}
\begin{verbatim}

> constrType :: QualIdent -> ValueEnv -> ExistTypeScheme
> constrType c tyEnv =
>   case qualLookupValue c tyEnv of
>     [DataConstructor _ sigma] -> sigma
>     [NewtypeConstructor _ sigma] -> sigma
>     _ -> internalError ("constrType " ++ show c)

> varType :: Ident -> ValueEnv -> TypeScheme
> varType v tyEnv =
>   case lookupValue v tyEnv of
>     [Value _ sigma] -> sigma
>     _ -> internalError ("varType " ++ show v)

> funType :: QualIdent -> ValueEnv -> TypeScheme
> funType f tyEnv =
>   case qualLookupValue f tyEnv of
>     [Value _ sigma] -> sigma
>     _ -> internalError ("funType " ++ show f)

> labelType :: Ident -> ValueEnv -> TypeScheme
> labelType l tyEnv =
>   case lookupValue l tyEnv of
>     [Label _ _ sigma] -> sigma
>     _ -> internalError ("labelType " ++ show l)

\end{verbatim}