= Current TODO list

TODOs, roughly in the order they are intended to be done

provide type check errors with source positions
sort out source position collection in parser
investigate syb for annotation instances
use f :: Annotation -> Doc for pretty printing annotations
work on statement info
get scope updating whilst checking
start work on type checking inside functions - start with params,
   return types, non plpgsql statements, stuff with selects (e.g. for).
work on parsing and type checking pg_dump output, then do a util to
   dump and type check a live database (this will lead to being able
   to run the lint process on a live database rather than source)
sort out api + do haddock
provide installation instructions for non haskell programmers
chain scopes when typechecking multiple files from util, provide api
   to do this from code


parse and/or type check todo list:
"identifier"
6.5e-5
type 'string' style type cast
[:] slice
missing keyword ops
default template1 operators should all parse
composite field selection
agg(all expr) agg(distinct expr), agg(*)
window frame clauses, named windows
parse inside string literals when cast, for common types
multidimensional arrays
implicit casting row values to composites
default values
serial
make sure can type check everything that parses
constraint names
provide list of keys in info for create/alter table: include unique
   not null and serials
type check fks, and other constraints
alter table: add/remove column
                        constraint
                        default value
                        column type
                        rename column
                        rename table
what other alters/creates
views, functions, operators, types, domains, triggers, rules
selects:
implicit joins
group by, having + group by with unaggregated and aggregated fields
distinct, on
order by - do properly
limit, offset
with queries
up to data types, ch 8 in pg manual

stage 2:
make this useful by working on the showinfodb function:
instead of interspersing the statementinfo with the pretty printed
   statements, insert or overwrite the statementinfo comments in the
   original source so we preserve formatting and other comments.
use the type pretty printer to make the statementinfo comments more readable
also want to output types of parts of statements, e.g. the select in a
   for statement or insert, etc. Other things that could be useful:
   adding the resolved function prototype for each overloaded function
   used, so you can see which overload is being called; writing out
   the canonicalized ast, so e.g. all the casts are made explicit,
   etc.
With this in place, the code actually has some use for real code, in
   that we can use it to easily view the types of views, etc. inline
   in the real sql source code

stage 3:
review and choose from this list:
* do null inference
* some selective fixups here and there to the typing (e.g. type
   checking constraints in create tables)
* selectively add some missing syntax, to cover the most glaring holes
* schema qualification
* type check statements inside create function
* something else from the todo for milestone 0.1 below
* something else

================================================================================

rough milestones for release 0.1:
in addition to all the stage 3 items above,

add support for nearly all syntax for parsing and type checking,
   instead of doing piecemeal bits, so go through the pg manual part
   II, support almost everything, add comprehensive simple tests, go
   through the sql reference section also. This is the time to
   document more precisely what isn't supported so there is a clear
   reference for this
do ? placeholders, and do typesafe haskell wrapper generation using this
figure out what to do about tricky operator precedence parsing, etc.
ability to type check all of chaos sql
example for generating sql code from haskell using the ast
get database loader and typesafe access generators good enough to use
   in chaos
example usage of each of these
look at the error message formatting, particularly try to fix the
   parser errors so they make more sense
add annotation field to most ast nodes, store type and source
   positioning in this field, fix parser to add lots of accurate
   positioning information when parsing.
make sure the lint process works on text dumps of databases.
try checking the sample databases: http://pgfoundry.org/projects/dbsamples/


================================================================================

some syntax todo, not organised:

------------
add support for following sql syntax (+ type checking)
alter table, common variations
create index
create rule
create trigger
+ drops for all creates
+ maybe alters?
ctes
loop, exit, labels
easy ones: transactions, savepoints, listen
prepare, execute + using
some more:
create or replace
alter table
transactions: begin, checkpoint, commit, end, rollback
cursors: declare, open, fetch, move, close, where current of
copy - parse properly
create database
create index
create rule
create trigger + plpgsql support
grant,revoke
listen, notify, unlisten
prepare, execute
savepoint, release savepoint, rollback to savepoint
set, reset
set constraints
set role
set transaction
correlated subquery attrs

plpgsql

blocks which aren't at the top level of a function
% types
strict on intos
not null for var defs
exception
execute using
get diagnostics
return query execute
raise missing bits
out params
elsif
loop
exit
labels
reverse, by in for
for in execute

expressions:
process string escapes, support dollar quoting and other quoting more
   robustly in the pretty printer
full user operator support (?)
fix expression parser properly to handle things like between - see
   grammar in pg source for info on how to do this
[:] array slices
aggregate: all and distinct
multi dimensional arrays: selectors and subscripting
missing keyword operators
datetime extract
time zone
subquery operators: any, some, all
in general, parsing operators is wrong, the lexer needs to be able to
   lex sequences of symbols into single/multiple operators correctly,
   what happens at the moment is a kludge, also, general operator
   parsing will change how operators are represented in the ast
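
A minimal sketch of the kind of lexing meant here, assuming a simple
   longest match rule: a run of symbol characters is split by always
   taking the longest known operator that prefixes the remaining
   input. The operator table knownOps and the function lexOps are made
   up for illustration, and the real postgresql rules for operator
   names are more involved than this.

```haskell
import Data.List (isPrefixOf, sortOn)
import Data.Ord (Down (..))

-- Hypothetical operator table; a real one would be built from the
-- operators defined in the database catalog.
knownOps :: [String]
knownOps =
  [ "<=", ">=", "<>", "!=", "||", "::"
  , "<", ">", "=", "+", "-", "*", "/", "|", "!", "@"
  ]

-- Split a run of symbol characters into operators, always taking the
-- longest known operator that is a prefix of the remaining input.
-- Returns Nothing if some part of the run matches no known operator.
lexOps :: String -> Maybe [String]
lexOps "" = Just []
lexOps s =
  case filter (`isPrefixOf` s) longestFirst of
    [] -> Nothing
    (op : _) -> (op :) <$> lexOps (drop (length op) s)
  where
    longestFirst = sortOn (Down . length) knownOps
```

The match is greedy and never backtracks, so a run which could only be
   split by choosing a shorter operator first would be rejected.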

================================================================================

some other random ideas:

null treatment
Basic motivation is to keep nulls carefully walled off and controlled,
   and to be able to catch them when they sneak back into expressions,
   etc. For each value we determine statically whether it might be
   null. This can be done for return types of functions, fields in a
   select expression, etc. (will do mappings, e.g. if a function's
   inputs are all non null, then the output is non null, etc.). Once
   this is working ok, the second stage is to implement the anti null
   warnings/errors.
Allow nulls in tables, outer joins, in coalesce, to be produced by
   selects (maybe add or remove from this allowed list, maybe make it
   configurable on a per project basis).
Never allow nulls to be an argument to a function call, (including
   ops, keyword ops, etc.). So every time you have a field being used
   in an expression and it cannot be statically verified to be non
   null, you have to insert a coalesce or fix it in some other way.
So nulls can still be used to represent optional values, n/a,
   etc. and output to clients doing selects, but there is no need to
   grapple with:
* 3vl (or whatever it is that sql uses instead),
* what the result of a function call is if some or all of the
  arguments are null,
* what the result of a sum aggregate is if some of the values are null,
* etc.,
because none of these things are allowed.
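
A rough sketch of the two stages described above, using a made up
   miniature expression type rather than the project's real ast. The
   names Expr, Nullability, nullability and nullArgs are all
   hypothetical, and the rule that a function with all non null inputs
   gives a non null output is the simplifying assumption from the
   notes, not something which holds for every function.

```haskell
-- Hypothetical miniature expression type, not the real ast.
data Expr
  = Field String          -- a column reference
  | NullLit               -- the literal null
  | IntLit Integer
  | FunCall String [Expr] -- functions, ops, keyword ops
  | Coalesce [Expr]
  deriving (Show, Eq)

data Nullability = NonNull | MaybeNull
  deriving (Show, Eq)

-- Names of the fields known to be declared not null (assumed input).
type NotNullFields = [String]

-- Stage one: statically determine whether an expression might be null.
nullability :: NotNullFields -> Expr -> Nullability
nullability nn e = case e of
  Field f
    | f `elem` nn -> NonNull
    | otherwise -> MaybeNull
  NullLit -> MaybeNull
  IntLit _ -> NonNull
  -- assumed mapping: all inputs non null => output non null
  FunCall _ args
    | all ((== NonNull) . nullability nn) args -> NonNull
    | otherwise -> MaybeNull
  -- coalesce is non null as soon as one of its arguments is
  Coalesce args
    | any ((== NonNull) . nullability nn) args -> NonNull
    | otherwise -> MaybeNull

-- Stage two: the anti null check. Collect every function call
-- argument, at any depth, which might be null.
nullArgs :: NotNullFields -> Expr -> [Expr]
nullArgs nn e = case e of
  FunCall _ args ->
    [a | a <- args, nullability nn a == MaybeNull]
      ++ concatMap (nullArgs nn) args
  Coalesce args -> concatMap (nullArgs nn) args
  _ -> []
```

With a declared not null and b nullable, nullArgs flags b in a + b,
   and stops flagging it once b is wrapped in a coalesce with a non
   null fallback.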


parser, converter and pretty printer for explain output, want to view
   how a query is executed in human readable pseudocode. Add lint type
   checks, etc. to this, which can suggest ways to rewrite the query
   to get better performance. Another idea is to make the dependencies
   on the values in the tables more explicit, so you can see how much
   the data can change before another plan is chosen, or you can see a
   bad assumption about the kind of data the query will be run on.

write a replacement psql shell, which can expose parse trees, type
   checks, lint checks, and doesn't use a one line at a time style
   interface (i.e. works more like writing and executing lisp in
   emacs, not like bash).

chain scope lookups instead of unioning them since unioning is too
   slow - or maybe use maps/sets, but still need to quickly scan whole
   lists, e.g. for function lookup, which can't really use any sort of
   key based lookup, since the key the function lookup needs is not
   the same as the key the map/set would use.
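
One possible shape for chained lookups, as a sketch only: each scope
   keeps a map for simple keyed lookups and a plain list of function
   prototypes, and a chain is just a list of scopes searched from the
   innermost outwards. Scope, ScopeChain, lookupType and lookupFuns
   are hypothetical names, and types are represented as plain strings
   to keep the example short.

```haskell
import qualified Data.Map as M
import Data.Maybe (listToMaybe, mapMaybe)

-- Hypothetical scope: a map for keyed lookups, plus a list of
-- function prototypes which has to be scanned.
data Scope = Scope
  { scopeTypes :: M.Map String String         -- identifier -> type name
  , scopeFuns :: [(String, [String], String)] -- name, arg types, return type
  }

-- Innermost scope first. Chaining is just consing onto this list, so
-- nothing gets copied or unioned when a new scope is added.
type ScopeChain = [Scope]

-- Keyed lookup: the first scope in the chain which has the name wins.
lookupType :: ScopeChain -> String -> Maybe String
lookupType chain name =
  listToMaybe (mapMaybe (M.lookup name . scopeTypes) chain)

-- Function lookup: scan every scope and return all the candidate
-- overloads, leaving overload resolution to the caller.
lookupFuns :: ScopeChain -> String -> [(String, [String], String)]
lookupFuns chain name =
  [f | s <- chain, f@(n, _, _) <- scopeFuns s, n == name]
```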

incorporate pg regression test sql into parsing and type checking
   tests

write a show for parsec errors which formats the lex tokens and
   expected lists properly (was broken when moved to the separate
   lexer)

add haddock docs to public api

write some example programs with plenty of comments - will this mainly
   be used as a library or as a utility though?

redo cabal file to add compile time options: exes, pg support, tests,
   or split into separate packages?

sort out modules/folder use

work on error reporting, add tests for malformed sql

add token location info to ast nodes, modify to support type
   checking, etc.

want to report multiple parse errors, perhaps can bodge this because
   of the property that ';' can only appear inside a string or
   comment, or otherwise at the end of a statement, so add some code
   to jump to the next ';' that looks like an end of statement and
   continue to parse to end of file in an attempt to catch at least
   some further syntax errors
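
A sketch of how the ';' bodge might look, assuming only single quoted
   strings, -- line comments and /* */ comments need to be skipped
   (dollar quoting and nested comments are ignored here, so function
   bodies would need more work). splitStatements and parseAll are made
   up names, and the parser passed to parseAll stands in for the real
   statement parser.

```haskell
import Data.Either (partitionEithers)

-- Split sql source into statement sized chunks at each ';' which is
-- not inside a single quoted string, a -- comment or a /* */ comment.
splitStatements :: String -> [String]
splitStatements = go ""
  where
    -- acc holds the current chunk in reverse order
    go acc [] = [reverse acc | any (not . isSpaceChar) acc]
    go acc (';' : rest) = reverse (';' : acc) : go "" rest
    go acc ('\'' : rest) = inString ('\'' : acc) rest
    go acc ('-' : '-' : rest) = inLineComment ('-' : '-' : acc) rest
    go acc ('/' : '*' : rest) = inBlockComment ('*' : '/' : acc) rest
    go acc (c : rest) = go (c : acc) rest

    -- '' inside a string is an escaped quote, not the end of the string
    inString acc ('\'' : '\'' : rest) = inString ('\'' : '\'' : acc) rest
    inString acc ('\'' : rest) = go ('\'' : acc) rest
    inString acc (c : rest) = inString (c : acc) rest
    inString acc [] = go acc []

    inLineComment acc ('\n' : rest) = go ('\n' : acc) rest
    inLineComment acc (c : rest) = inLineComment (c : acc) rest
    inLineComment acc [] = go acc []

    inBlockComment acc ('*' : '/' : rest) = go ('/' : '*' : acc) rest
    inBlockComment acc (c : rest) = inBlockComment (c : acc) rest
    inBlockComment acc [] = go acc []

    isSpaceChar c = c `elem` " \t\r\n"

-- Run a statement parser over each chunk and keep every error,
-- instead of stopping at the first one.
parseAll :: (String -> Either e a) -> String -> ([e], [a])
parseAll p = partitionEithers . map p . splitStatements
```

Source positions in the collected errors would be relative to each
   chunk, so they would need offsetting to be useful.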

improve tests:
identify each bit of syntax and make sure there is a test for it
add some bigger tests: lots of sql statements, big functions
look for possible corner cases and add tests

get property checker working again - one problem is that the pretty
   printer will reject some asts (which the parser cannot
   produce), and the parser will probably reject some invalid sql that
   the pretty printer will happily produce from some asts.
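
To illustrate the problem, here is a toy round trip property on a made
   up ast with only non negative integer literals and addition. The
   pretty printer happily prints a negative literal which the parser
   rejects, so the property is only claimed for asts satisfying a
   validity predicate. All the names here are hypothetical, and the
   real property checker would use generated test data instead of
   hand written cases.

```haskell
import Data.Char (isDigit)

-- Hypothetical toy ast: integer literals and addition only.
data Expr = Lit Int | Add Expr Expr
  deriving (Show, Eq)

pretty :: Expr -> String
pretty (Lit n) = show n
pretty (Add a b) = "(" ++ pretty a ++ "+" ++ pretty b ++ ")"

-- Parse one expression, returning it with the unconsumed input.
parseExpr :: String -> Maybe (Expr, String)
parseExpr ('(' : s) = do
  (a, s1) <- parseExpr s
  '+' : s2 <- Just s1
  (b, s3) <- parseExpr s2
  ')' : s4 <- Just s3
  Just (Add a b, s4)
parseExpr s = case span isDigit s of
  ("", _) -> Nothing
  (ds, rest) -> Just (Lit (read ds), rest)

-- Parse a whole string, failing if any input is left over.
parse :: String -> Maybe Expr
parse s = case parseExpr s of
  Just (e, "") -> Just e
  _ -> Nothing

-- The asts which the parser can produce: no negative literals.
valid :: Expr -> Bool
valid (Lit n) = n >= 0
valid (Add a b) = valid a && valid b

-- The property: every valid ast survives pretty printing and parsing.
-- Invalid asts pass trivially, since nothing is claimed about them.
roundTrips :: Expr -> Bool
roundTrips e = not (valid e) || parse (pretty e) == Just e
```

An alternative to the validity predicate is a generator which only
   produces asts the parser can produce.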

ability to write new lint checks, and choose which lint checks to use
   on a per project basis.

plpgsql on 'roids:
write libraries in haskell, and then write syntax extensions for
   plpgsql using the extension mechanism to access these libs from
   extended plpgsql, e.g. a ui lib written in haskell, accessed by
   syntax extensions in plpgsql. Then can write the database and ui
   all in the same source code in the same language, with first class
   support for properly typed relation valued expressions, avoiding
   multiple languages and the mapping/'impedance mismatch' between
   database types and types in the language you write the ui in.