{- Copyright 2009 Jake Wheat This file contains the attr and sem definitions, which do the type checking, etc.. A lot of the haskell code has been moved into other files: TypeCheckingH, AstUtils.lhs, it is intended that only small amounts of code appear (i.e. one-liners) inline in this file, and larger bits go in AstUtils.lhs. These are only divided because the attribute grammar system uses a custom syntax with a custom preprocessor. These guidelines aren't followed very well. The current type checking approach doesn't quite match how SQL works. The main problem is that you can e.g. exec create table statements inside a function. This is something that the type checker will probably not be able to deal for a while if ever. (Will need hooks into postgresql to do this properly, which might not be impossible...). Once most of the type checking is working, all the code and documentation will be overhauled quite a lot. Alternatively put, this code is in need of better documentation and organisation, and serious refactoring. An unholy mixture of Maybe, Either, Either/Error Monad and TypeCheckFailed is used to do error handling. One of the common patterns is when type checking a node: first check any types of subnodes which it depends on, if any are typecheckfailed, then this node's type is typecheckfailed and we stop there. otherwise, calculate the type of this node, or get an error if there is a problem, this is put into loc.tpe which is Either TypeError Type. Then some common code takes this value and sets the node type to the type or typecheckfailed if tpe is left, and if it is left add a type error annotation also. ================================================================================ = main attributes used Here are the main attributes used in the type checking: env is used to chain the environments up and down the tree, to allow access to the catalog information, and to store the in environment identifier names and types e.g. inside a select expression. annotatedTree is used to create a copy of the ast with the type, etc. annotations. -} ATTR AllNodes Root ExpressionRoot [ env : Environment || annotatedTree : SELF] ATTR StatementList Root [|| producedEnv : Environment] {- ================================================================================ = expressions -} { annTypesAndErrors :: Data a => a -> Type -> [TypeError] -> Maybe [AnnotationElement] -> a annTypesAndErrors item nt errs add = updateAnnotation modifier item where modifier = (([TypeAnnotation nt] ++ fromMaybe [] add ++ map TypeErrorA errs) ++) } SEM Expression | IntegerLit StringLit FloatLit BooleanLit NullLit FunCall Identifier Exists Case CaseSimple Cast InPredicate ScalarSubQuery --PositionalArg WindowFn lhs.annotatedTree = annTypesAndErrors @loc.backTree (errorToTypeFail @loc.tpe) (getErrors @loc.tpe) Nothing {- == literals -} SEM Expression | IntegerLit loc.backTree = IntegerLit @ann @i | StringLit loc.backTree = StringLit @ann @quote @value | FloatLit loc.backTree = FloatLit @ann @d | BooleanLit loc.backTree = BooleanLit @ann @b | NullLit loc.backTree = NullLit @ann SEM Expression | IntegerLit loc.tpe = Right typeInt | StringLit loc.tpe = Right UnknownStringLit | FloatLit loc.tpe = Right typeNumeric | BooleanLit loc.tpe = Right typeBool -- I think a null has the same type resolution as an unknown string lit | NullLit loc.tpe = Right UnknownStringLit {- == cast expression -} SEM Expression | Cast loc.tpe = Right $ @tn.namedType loc.backTree = Cast @ann @expr.annotatedTree @tn.annotatedTree {- == type names Types with type modifiers (called PrecTypeName here, to be changed), are not supported at the moment. -} ATTR TypeName [||namedType : Type] SEM TypeName | SimpleTypeName ArrayTypeName SetOfTypeName PrecTypeName lhs.namedType = errorToTypeFail @loc.tpe lhs.annotatedTree = updateAnnotation ((map TypeErrorA $ getErrors @loc.tpe) ++) @loc.backTree SEM TypeName | SimpleTypeName loc.tpe = envLookupType @lhs.env $ canonicalizeTypeName @tn loc.backTree = SimpleTypeName @ann @tn | ArrayTypeName loc.tpe = chainTypeCheckFailed [@typ.namedType] $ Right $ ArrayType @typ.namedType loc.backTree = ArrayTypeName @ann @typ.annotatedTree | SetOfTypeName loc.tpe = chainTypeCheckFailed [@typ.namedType] $ Right $ SetOfType @typ.namedType loc.backTree = SetOfTypeName @ann @typ.annotatedTree | PrecTypeName loc.tpe = Right TypeCheckFailed loc.backTree = PrecTypeName @ann @tn @prec {- == operators and functions -} SEM Expression | FunCall loc.tpe = chainTypeCheckFailed @args.typeList $ typeCheckFunCall @lhs.env @funName @args.typeList loc.backTree = FunCall @ann @funName @args.annotatedTree {- == case expression for non simple cases, we need all the when expressions to be bool, and then to collect the types of the then parts to see if we can resolve a common type for simple cases, we need to check all the when parts have the same type as the value to check against, then we collect the then parts as above. -} SEM Expression | Case CaseSimple loc.whenTypes = map getTypeAnnotation $ concatMap fst $ @cases.annotatedTree loc.thenTypes = map getTypeAnnotation $ (map snd $ @cases.annotatedTree) ++ maybeToList @els.annotatedTree SEM Expression | Case loc.tpe = chainTypeCheckFailed @loc.whenTypes $ do when (any (/= typeBool) @loc.whenTypes) $ Left [WrongTypes typeBool @loc.whenTypes] chainTypeCheckFailed @loc.thenTypes $ resolveResultSetType @lhs.env @loc.thenTypes loc.backTree = Case @ann @cases.annotatedTree @els.annotatedTree SEM Expression | CaseSimple loc.tpe = chainTypeCheckFailed @loc.whenTypes $ do checkWhenTypes <- resolveResultSetType @lhs.env (getTypeAnnotation @value.annotatedTree: @loc.whenTypes) chainTypeCheckFailed @loc.thenTypes $ resolveResultSetType @lhs.env @loc.thenTypes loc.backTree = CaseSimple @ann @value.annotatedTree @cases.annotatedTree @els.annotatedTree {- == identifiers pull id types out of env for identifiers -} SEM Expression | Identifier loc.tpe = let (correlationName,iden) = splitIdentifier @i in envLookupID @lhs.env correlationName iden loc.backTree = Identifier @ann @i SEM Expression | Exists loc.tpe = Right typeBool loc.backTree = Exists @ann @sel.annotatedTree {- == scalar subquery 1 col -> type of that col 2 + cols -> row type -} SEM Expression | ScalarSubQuery loc.tpe = let selType = getTypeAnnotation @sel.annotatedTree in chainTypeCheckFailed [selType] $ do f <- map snd <$> unwrapSetOfComposite selType case length f of 0 -> Left [InternalError "no columns in scalar subquery?"] 1 -> Right $ head f _ -> Right $ RowCtor f loc.backTree = ScalarSubQuery @ann @sel.annotatedTree {- == inlist -} SEM Expression | InPredicate loc.tpe = do lt <- @list.listType ty <- resolveResultSetType @lhs.env [getTypeAnnotation @expr.annotatedTree, lt] return typeBool loc.backTree = InPredicate @ann @expr.annotatedTree @i @list.annotatedTree ATTR InList [||listType : {Either [TypeError] Type}] SEM InList | InList lhs.listType = resolveResultSetType @lhs.env @exprs.typeList | InSelect lhs.listType = do attrs <- map snd <$> (unwrapSetOfComposite $ let a = getTypeAnnotation @sel.annotatedTree in {-trace ("attrs is: " ++ show a) $-} a) typ <- case length attrs of 0 -> Left [InternalError "got subquery with no columns? in inselect"] 1 -> Right $ head attrs _ -> Right $ RowCtor attrs chainTypeCheckFailed attrs $ Right typ ATTR ExpressionList [||typeList : {[Type]}] SEM ExpressionList | Cons lhs.typeList = getTypeAnnotation @hd.annotatedTree : @tl.typeList | Nil lhs.typeList = [] ATTR ExpressionListList [||typeListList : {[[Type]]}] SEM ExpressionListList | Cons lhs.typeListList = @hd.typeList : @tl.typeListList | Nil lhs.typeListList = [] {- ================================================================================ = statements -} SEM Statement | SelectStatement Insert Update Delete CreateView CreateDomain CreateFunction CreateType CreateTable Return lhs.annotatedTree = annTypesAndErrors @loc.backTree (errorToTypeFail @loc.tpe) (getErrors @loc.tpe) $ Just (map StatementInfoA @loc.statementInfo ++ [EnvUpdates @loc.envUpdates]) lhs.envUpdates = @loc.envUpdates ATTR Statement [||envUpdates : {[EnvironmentUpdate]}] SEM Statement | Assignment CaseStatement ContinueStatement Copy CopyData DropFunction DropSomething Execute ExecuteInto ForIntegerStatement ForSelectStatement If NullStatement Perform Raise ReturnNext ReturnQuery Truncate WhileStatement lhs.envUpdates = [] ATTR StatementList [ envUpdates : {[EnvironmentUpdate]}||] SEM Root | Root statements.envUpdates = [] SEM StatementList | Cons loc.newEnv = fromRight @lhs.env $ updateEnvironment @lhs.env @lhs.envUpdates hd.env = @loc.newEnv tl.env = @loc.newEnv lhs.producedEnv = case @tl.annotatedTree of [] -> @loc.newEnv _ -> @tl.producedEnv tl.envUpdates = @hd.envUpdates | Nil lhs.producedEnv = emptyEnvironment SEM ExpressionListStatementListPair | Tuple x2.envUpdates = [] SEM ExpressionStatementListPair | Tuple x2.envUpdates = [] SEM FnBody | PlpgsqlFnBody SqlFnBody sts.envUpdates = [] SEM Statement | CaseStatement If els.envUpdates = [] SEM Statement | ForIntegerStatement ForSelectStatement WhileStatement sts.envUpdates = [] {- ================================================================================ = basic select statements This is a bit of a mess, will be rewritten with a proper literate flavour once all the different bits are type checking ok, which should make it much more readable. -} SEM Statement | SelectStatement loc.tpe = chainTypeCheckFailed [getTypeAnnotation @ex.annotatedTree] $ Right $ Pseudo Void loc.statementInfo = [SelectInfo $ getTypeAnnotation @ex.annotatedTree] loc.backTree = SelectStatement @ann @ex.annotatedTree loc.envUpdates = [] SEM SelectExpression | Values Select CombineSelect lhs.annotatedTree = annTypesAndErrors @loc.backTree (errorToTypeFail @loc.tpe) (getErrors @loc.tpe) Nothing SEM SelectExpression | Values loc.tpe = typeCheckValuesExpr @lhs.env @vll.typeListList loc.backTree = Values @ann @vll.annotatedTree | Select loc.tpe = do let trefType = let ts = @selTref.annotatedTree in case null ts of True -> typeBool _ -> getTypeAnnotation $ head ts {-fromMaybe typeBool $ fmap getTypeAnnotation @selTref.annotatedTree-} slType = @selSelectList.listType chainTypeCheckFailed [trefType, slType] $ Right $ case slType of UnnamedCompositeType [(_,Pseudo Void)] -> Pseudo Void _ -> SetOfType slType loc.backTree = Select @ann @selDistinct.annotatedTree @selSelectList.annotatedTree @selTref.annotatedTree @selWhere.annotatedTree @selGroupBy.annotatedTree @selHaving.annotatedTree @selOrderBy.annotatedTree @selLimit.annotatedTree @selOffset.annotatedTree | CombineSelect loc.tpe = let sel1t = getTypeAnnotation @sel1.annotatedTree sel2t = getTypeAnnotation @sel2.annotatedTree in chainTypeCheckFailed [sel1t, sel2t] $ typeCheckCombineSelect @lhs.env sel1t sel2t loc.backTree = CombineSelect @ann @ctype.annotatedTree @sel1.annotatedTree @sel2.annotatedTree { getTbCols c = unwrapSetOfComposite (getTypeAnnotation c) } ATTR TableRef TableRefList [||idens : {[(String,([(String,Type)],[(String,Type)]))]} joinIdens : {[String]} ] SEM TableRef | SubTref TrefAlias Tref TrefFun TrefFunAlias JoinedTref lhs.annotatedTree = annTypesAndErrors @loc.backTree (errorToTypeFail @loc.tpe) (getErrors @loc.tpe) Nothing -- one of the main hairy bits of code which needs a serious refactor: SEM TableRef | SubTref loc.tpe = chainTypeCheckFailed [getTypeAnnotation @sel.annotatedTree] <$> unwrapSetOfWhenComposite $ getTypeAnnotation @sel.annotatedTree loc.backTree = SubTref @ann @sel.annotatedTree @alias lhs.idens = [(@alias, (fromRight [] $ getTbCols @sel.annotatedTree, []))] lhs.joinIdens = [] | TrefAlias Tref loc.tpe = either Left (Right . fst) @loc.relType lhs.joinIdens = [] loc.relType = getRelationType @lhs.env @tbl loc.unwrappedRelType = fromRight ([],[]) $ do lrt <- @loc.relType let (UnnamedCompositeType a,UnnamedCompositeType b) = lrt return (a,b) | Tref lhs.idens = [(@tbl, @loc.unwrappedRelType)] loc.backTree = Tref @ann @tbl | TrefAlias lhs.idens = [(@alias, @loc.unwrappedRelType)] loc.backTree = TrefAlias @ann @tbl @alias | TrefFun TrefFunAlias loc.tpe = getFnType @lhs.env @loc.alias1 @fn.annotatedTree lhs.joinIdens = [] lhs.idens = case getFunIdens @lhs.env @loc.alias1 @fn.annotatedTree of Right (s, UnnamedCompositeType c) -> [(s,(c,[]))] _ -> [] | TrefFun loc.alias1 = "" loc.backTree = TrefFun @ann @fn.annotatedTree | TrefFunAlias loc.alias1 = @alias loc.backTree = TrefFunAlias @ann @fn.annotatedTree @alias | JoinedTref loc.tpe = chainTypeCheckFailed [tblt ,tbl1t] $ case (@nat.annotatedTree, @onExpr.annotatedTree) of (Natural, _) -> unionJoinList $ commonFieldNames tblt tbl1t (_,Just (JoinUsing _ s)) -> unionJoinList s _ -> unionJoinList [] where tblt = getTypeAnnotation @tbl.annotatedTree tbl1t = getTypeAnnotation @tbl1.annotatedTree unionJoinList s = combineTableTypesWithUsingList @lhs.env s tblt tbl1t lhs.idens = @tbl.idens ++ @tbl1.idens lhs.joinIdens = commonFieldNames (getTypeAnnotation @tbl.annotatedTree) (getTypeAnnotation @tbl1.annotatedTree) loc.backTree = JoinedTref @ann @tbl.annotatedTree @nat.annotatedTree @joinType.annotatedTree @tbl1.annotatedTree @onExpr.annotatedTree { getFnType :: Environment -> String -> Expression -> Either [TypeError] Type getFnType env alias = either Left (Right . snd) . getFunIdens env alias getFunIdens :: Environment -> String -> Expression -> Either [TypeError] (String,Type) getFunIdens env alias fnVal = case fnVal of FunCall _ f _ -> let correlationName = if alias /= "" then alias else f in Right (correlationName, case getTypeAnnotation fnVal of SetOfType (CompositeType t) -> getCompositeType t SetOfType x -> UnnamedCompositeType [(correlationName,x)] y -> UnnamedCompositeType [(correlationName,y)]) x -> Left [ContextError "FunCall"] where getCompositeType t = case getAttrs env [Composite ,TableComposite ,ViewComposite] t of Just (_,_,a@(UnnamedCompositeType _), _) -> a _ -> UnnamedCompositeType [] } SEM TableRefList | Cons lhs.idens = @hd.idens lhs.joinIdens = @hd.joinIdens | Nil lhs.idens = [] lhs.joinIdens = [] ATTR SelectItemList SelectList [||listType : Type] SEM SelectItemList | Cons lhs.listType = doSelectItemListTpe @lhs.env @hd.columnName @hd.itemType @tl.listType | Nil lhs.listType = UnnamedCompositeType [] ATTR SelectItem [||itemType : Type] SEM SelectItem | SelExp SelectItem lhs.itemType = getTypeAnnotation @ex.annotatedTree -- hack to fix up for errors for ok * SEM SelectItem | SelExp loc.annotatedTree = SelExp @ann $ fixStar @ex.annotatedTree | SelectItem loc.backTree = SelectItem @ann (fixStar @ex.annotatedTree) @name { fixStar :: Expression -> Expression fixStar = everywhere (mkT fixStar') where fixStar' :: Annotation -> Annotation fixStar' a = if TypeAnnotation TypeCheckFailed `elem` a && any (\an -> case an of TypeErrorA (UnrecognisedIdentifier x) | let (_,iden) = splitIdentifier x in iden == "*" -> True _ -> False) a then filter (\an -> case an of TypeAnnotation TypeCheckFailed -> False TypeErrorA (UnrecognisedIdentifier _) -> False _ -> True) a else a } SEM SelectList | SelectList lhs.listType = @items.listType {- == env passing env flow: current simple version: from tref -> select list -> where (so we take the identifiers and types from the tref part, and send them into the selectlist and where parts) full order of identifier passing: 1. from 2. where 3. group by 4. having 5. select -} SEM SelectExpression | Select loc.newEnv = case updateEnvironment @lhs.env (convertToNewStyleUpdates @selTref.idens @selTref.joinIdens) of Left x -> error $ show x -- @lhs.env Right e -> e selSelectList.env = @loc.newEnv selWhere.env = @loc.newEnv {- == attributes columnName is used to collect the column names that the select list produces, it is combined into an unnamedcompositetype in selectitemlist, which is also where star expansion happens. -} ATTR SelectItem [||columnName : String] {- if the select item is just an identifier, then that column is named after the identifier e.g. select a, b as c, b + c from d, gives three columns one named a, one named c, and one unnamed, even though only one has an alias if the select item is a function or aggregate call at the top level, then it is named after that function or aggregate if it is a cast, the column is named after the target data type name iff it is a simple type name -} --default value for non identifier nodes ATTR Expression [||liftedColumnName USE {`(fixedValue "")`} {""}: String] { fixedValue :: a -> a -> a -> a fixedValue a _ _ = a } {- override for identifier nodes, this only makes it out to the selectitem node if the identifier is not wrapped in parens, function calls, etc. -} SEM Expression | Identifier lhs.liftedColumnName = @i | FunCall lhs.liftedColumnName = if isOperatorName @funName then "" else @funName | Cast lhs.liftedColumnName = case @tn.annotatedTree of SimpleTypeName _ tn -> tn _ -> "" -- collect the aliases and column names for use by the selectitemlist nodes SEM SelectItem | SelExp lhs.columnName = case @ex.liftedColumnName of "" -> "?column?" s -> s | SelectItem lhs.columnName = @name {- ================================================================================ = insert -} { getCAtts t = case t of SetOfType (UnnamedCompositeType t) -> t _ -> [] } SEM Statement | Insert loc.columnStuff = checkColumnConsistency @lhs.env @table @targetCols.strings (getCAtts $ getTypeAnnotation @insData.annotatedTree) loc.tpe = chainTypeCheckFailed [getTypeAnnotation @insData.annotatedTree] $ do @loc.columnStuff Right $ Pseudo Void loc.statementInfo = [InsertInfo @table $ errorToTypeFailF UnnamedCompositeType @loc.columnStuff] loc.backTree = Insert @ann @table @targetCols.annotatedTree @insData.annotatedTree @returning loc.envUpdates = [] ATTR StringList [||strings : {[String]}] SEM StringList | Cons lhs.strings = @hd : @tl.strings | Nil lhs.strings = [] {- ================================================================================ = update -} SEM Statement | Update loc.tpe = do let re = checkRelationExists @lhs.env @table when (isJust re) $ Left [fromJust $ re] chainTypeCheckFailed (map snd @assigns.pairs) $ do @loc.columnsConsistent checkErrorList @assigns.rowSetErrors $ Pseudo Void loc.columnsConsistent = checkColumnConsistency @lhs.env @table (map fst @assigns.pairs) @assigns.pairs loc.statementInfo = [UpdateInfo @table $ errorToTypeFailF UnnamedCompositeType @loc.columnsConsistent] loc.backTree = Update @ann @table @assigns.annotatedTree @whr.annotatedTree @returning loc.envUpdates = [] ATTR SetClauseList [||pairs : {[(String,Type)]} rowSetErrors : {[TypeError]}] SEM SetClauseList | Cons lhs.pairs = @hd.pairs ++ @tl.pairs lhs.rowSetErrors = maybeToList @hd.rowSetError ++ @tl.rowSetErrors | Nil lhs.pairs = [] lhs.rowSetErrors = [] ATTR SetClause [||pairs : {[(String,Type)]} rowSetError : {Maybe TypeError}] SEM SetClause | SetClause lhs.pairs = [(@att, getTypeAnnotation @val.annotatedTree)] lhs.rowSetError = Nothing | RowSetClause loc.rowSetError = let atts = @atts.strings types = getRowTypes @vals.typeList in if length atts /= length types then Just WrongNumberOfColumns else Nothing lhs.pairs = zip @atts.strings $ getRowTypes @vals.typeList { getRowTypes :: [Type] -> [Type] getRowTypes [RowCtor ts] = ts getRowTypes ts = ts } {- ================================================================================ = delete -} SEM Statement | Delete loc.tpe = case checkRelationExists @lhs.env @table of Just e -> Left [e] Nothing -> Right $ Pseudo Void loc.statementInfo = [DeleteInfo @table] loc.backTree = Delete @ann @table @whr.annotatedTree @returning loc.envUpdates = [] {- ================================================================================ = create table env needs to be modified: types, typenames, typecats, attrdefs, systemcolumns produces a compositedef: (name, tablecomposite, unnamedcomp [(attrname, type)]) -} ATTR AttributeDef [||attrName : String namedType : Type] SEM AttributeDef | AttributeDef lhs.attrName = @name lhs.namedType = @typ.namedType ATTR AttributeDefList [||attrs : {[(String, Type)]}] SEM AttributeDefList | Cons lhs.attrs = (@hd.attrName, @hd.namedType) : @tl.attrs | Nil lhs.attrs = [] SEM Statement | CreateTable loc.attrTypes = map snd @atts.attrs loc.tpe = Right $ Pseudo Void loc.backTree = CreateTable @ann @name @atts.annotatedTree @cons.annotatedTree loc.statementInfo = [] loc.envUpdates = [EnvCreateTable @name @atts.attrs []] SEM Statement | CreateTableAs loc.selType = getTypeAnnotation @expr.annotatedTree loc.tpe = Right @loc.selType loc.backTree = CreateTableAs @ann @name @expr.annotatedTree loc.statementInfo = [] loc.attrs = case @loc.selType of UnnamedCompositeType c -> c _-> [] loc.envUpdates = [EnvCreateTable @name @loc.attrs []] {- ================================================================================ = create view -} SEM Statement | CreateView loc.tpe = chainTypeCheckFailed [getTypeAnnotation @expr.annotatedTree] $ Right $ Pseudo Void loc.backTree = CreateView @ann @name @expr.annotatedTree loc.statementInfo = [] loc.attrs = case getTypeAnnotation @expr.annotatedTree of SetOfType (UnnamedCompositeType c) -> c _ -> [] loc.envUpdates = [EnvCreateView @name @loc.attrs] {- ================================================================================ = create type -} ATTR TypeAttributeDef [||attrName : String namedType : Type] SEM TypeAttributeDef | TypeAttDef lhs.attrName = @name lhs.namedType = @typ.namedType ATTR TypeAttributeDefList [||attrs : {[(String, Type)]}] SEM TypeAttributeDefList | Cons lhs.attrs = (@hd.attrName, @hd.namedType) : @tl.attrs | Nil lhs.attrs = [] SEM Statement | CreateType loc.tpe = Right $ Pseudo Void loc.backTree = CreateType @ann @name @atts.annotatedTree loc.statementInfo = [] loc.envUpdates = [EnvCreateComposite @name @atts.attrs] {- ================================================================================ = create domain -} SEM Statement | CreateDomain loc.tpe = Right $ Pseudo Void loc.backTree = CreateDomain @ann @name @typ.annotatedTree @check.annotatedTree loc.statementInfo = [] loc.envUpdates = [EnvCreateDomain (ScalarType @name) @typ.namedType] {- ================================================================================ = create function get the signature, does some checking of the body, still a bit limited ISSUE: when writing an sql file, you can put a create function which refers to a table definition that is given later. As long as the function isn't called before the table definition is given, this is ok. To handle this, need to gather the function prototype, but delay checking the contents until either a) all the other type checking has been done, or b) the function is needed (list ways this can happen: used in a view (even then, not needed until view is used), function can be called directly, or indirectly in another function call, ...) No thoughts on how to do this - but at some point want to support 'declarative' sql source code, where the order doesn't matter, and this code figures out an order to load it into the database which will get past pgs checks, so hopefully the solution will move towards this goal also. One additional consideration is that the error message in a situation like this would be really helpful if it could tell that a problem like this could be fixed with a reordering, and suggest that reordering. -} ATTR ParamDef [||paramName : String] ATTR ParamDefList [||params : {[(String, Type)]}] ATTR ParamDef [||namedType : Type] SEM ParamDef | ParamDef ParamDefTp lhs.namedType = @typ.namedType | ParamDef lhs.paramName = @name | ParamDefTp lhs.paramName = "" SEM ParamDefList | Nil lhs.params = [] | Cons lhs.params = ((@hd.paramName, @hd.namedType) : @tl.params) SEM Statement | CreateFunction loc.tpe = Right $ Pseudo Void loc.backTree = CreateFunction @ann @lang.annotatedTree @name @params.annotatedTree @rettype.annotatedTree @bodyQuote @body.annotatedTree @vol.annotatedTree loc.statementInfo = [] loc.envUpdates = [EnvCreateFunction FunName @name (map snd @params.params) @rettype.namedType] --add the parameters to the environment for the contained statements body.env = fromRight @lhs.env $ updateEnvironment @lhs.env [EnvStackIDs [("", @params.params) ,(@name, @params.params)]] SEM FnBody | PlpgsqlFnBody sts.env = fromRight @lhs.env $ updateEnvironment @lhs.env [EnvStackIDs [("", @vars.defs)]] ATTR VarDef [||def : {(String,Type)}] ATTR VarDefList [||defs : {[(String,Type)]}] SEM VarDef | VarDef lhs.def = (@name, @typ.namedType) SEM VarDefList | Cons lhs.defs = @hd.def : @tl.defs | Nil lhs.defs = [] {- ================================================================================ = statement common components -} SEM MaybeBoolExpression | Just lhs.annotatedTree = if getTypeAnnotation @just.annotatedTree `notElem` [typeBool, TypeCheckFailed] then Just $ updateAnnotation ((TypeErrorA ExpressionMustBeBool) :) @just.annotatedTree else Just $ @just.annotatedTree ATTR MaybeExpression [||exprType : {Maybe Type}] SEM MaybeExpression | Just lhs.exprType = Just $ getTypeAnnotation @just.annotatedTree | Nothing lhs.exprType = Nothing {- ================================================================================ = plpgsql statements -} SEM Statement | Return loc.tpe = chainTypeCheckFailed [fromMaybe typeBool @value.exprType] $ Right $ Pseudo Void loc.backTree = Return @ann @value.annotatedTree loc.envUpdates = [] loc.statementInfo = [] {- ================================================================================ = static tests Try to use a list of message data types to hold all sorts of information which works its way out to the top level where the client code gets it. Want to have the lists concatenated together automatically from subnodes to parent node, and then to be able to add extra messages to this list at each node also. Problem 1: can't have two sem statements for the same node type which both add messages, and then the messages get combined to provide the final message list attribute value for that node. You want this so that e.g. that different sorts of checks appear in different sections. Workaround is instead of having each check in it's own section, to combine them all into one SEM. Problem 2: no shorthand to combine what the default rule for messages would be and then add a bit extra - so if you want all the children messages, plus possibly an extra message or two, have to write out the default rule in full explicitly. Can get round this by writing out loads of code. Both the workarounds to these seem a bit tedious and error prone, and will make the code much less readable. Maybe need a preprocessor to produce the ag file? Alternatively, just attach the messages to each node (so this appears in the data types and isn't an attribute, then have a tree walker collect them all). Since an annotation field in each node is going to be added anyway, so each node can be labelled with a type, will probably do this at some point. ================================================================================ = inloop testing inloop - use to check continue, exit, and other commands that can only appear inside loops (for, while, loop) the only nodes that really need this attribute are the ones which can contain statements The inloop test is the only thing which uses the messages atm. It shouldn't, at some point inloop testing will become part of the type checking. This is just some example code, will probably do something a lot more heavy weight like symbolic interpretation - want to do all sorts of loop, return, nullability, etc. analysis. -} {- ATTR AllNodes Root ExpressionRoot [||messages USE {++} {[]} : {[Message]}] ATTR AllNodes [ inLoop: Bool||] SEM Root | Root statements.inLoop = False SEM ExpressionRoot | ExpressionRoot expr.inLoop = False -- set the inloop stuff which nests, it's reset inside a create -- function statement, in case you have a create function inside a -- loop, seems unlikely you'd do this though SEM Statement | ForSelectStatement ForIntegerStatement WhileStatement sts.inLoop = True | CreateFunction body.inLoop = False -- now we can check when we hit a continue statement if it is in the -- right context SEM Statement | ContinueStatement lhs.messages = if not @lhs.inLoop then [Error ContinueNotInLoop] else [] -} {- ================================================================================ = notes and todo containment guide for select expressions: combineselect 2 selects insert ?select createtableas 1 select createview 1 select return query 1 select forselect 1 select select->subselect select expression->exists select scalarsubquery select inselect select containment guide for statements: forselect [statement] forinteger [statement] while [statement] casestatement [[statement]] if [[statement]] createfunction->fnbody [Statement] TODO some non type-check checks: check plpgsql only in plpgsql function orderby in top level select only copy followed immediately by copydata iff stdin, copydata only follows copy from stdin count args to raise, etc., check same number as placeholders in string no natural with onexpr in joins typename -> setof (& fix parsing), what else like this? expressions: positionalarg in function, window function only in select list top level review all ast checks, and see if we can also catch them during parsing (e.g. typeName parses setof, but this should only be allowed for a function return, and we can make this a parse error when parsing from source code rather than checking a generated ast. This needs judgement to say whether a parse error is better than a check error, I think for setof it is, but e.g. for a continue not in a loop (which could be caught during parsing) works better as a check error, looking at the error message the user will get. This might be wrong, haven't thought too carefully about it yet). TODO: canonicalize ast process, as part of type checking produces a canonicalized ast which: all implicit casts appear explicitly in the ast (maybe distinguished from explicit casts?) all names fully qualified all types use canonical names literal values and selectors in one form (use row style?) nodes are tagged with types what else? Canonical form only defined for type consistent asts. This canonical form should pretty print and parse back to the same form, and type check correctly. -}