-- Hoogle documentation, generated by Haddock
-- See Hoogle, http://www.haskell.org/hoogle/


-- | DBFunctor - Functional Data Management =>  ETL/ELT Data Processing in Haskell
--   
--   Please see the README on Github at
--   <a>https://github.com/nkarag/haskell-DBFunctor</a>
@package DBFunctor
@version 0.1.1.0


-- | This is the core module that implements the relational Table concept
--   with the <a>RTable</a> data type. It defines all necessary data types
--   like <a>RTable</a> and <a>RTuple</a> as well as all the basic
--   relational algebra operations (selection -i.e., filter- , projection,
--   inner/outer join, aggregation, grouping etc.) on <a>RTable</a>s.
--   
--   <h1>When to use this module</h1>
--   
--   This module should be used whenever one has "tabular data" (e.g., some
--   CSV files, or any type of data that can be an instance of the
--   <a>RTabular</a> type class and thus define the <a>toRTable</a> and
--   <a>fromRTable</a> functions) and wants to analyze them in-memory with
--   the well-known relational algebra operations (selection, projection,
--   join, groupby, aggregations etc) that lie behind SQL. This data
--   analysis takes place within your haskell code, without the need to
--   import the data into a database (database-less data processing) and
--   the result can be turned into the original format (e.g., CSV) with a
--   simple call to the <a>fromRTable</a> function.
--   
--   <a>RTable.Core</a> gives you an interface for all common relational
--   algebra operations, which are expressed as functions over the basic
--   <a>RTable</a> data type. Of course, since each relational algebra
--   operation is a function that returns a new RTable (immutability), one
--   can compose these operations and thus express an arbitrarily complex
--   query. Immutability also holds for DML operations (e.g.,
--   <a>updateRTab</a>). This means that any update on an RTable operates
--   like a <tt>CREATE AS SELECT</tt> statement in SQL, creating a new
--   <a>RTable</a> and not modifying an existing one.
--   
--   Note that the recommended way to perform data analysis via
--   relational algebra operations is to use the type-level <b>Embedded
--   Domain Specific Language</b> <b>(EDSL) Julius</b>, defined in module
--   <a>Etl.Julius</a>, which exports the <a>RTable.Core</a> module. This
--   provides a standard way of expressing queries and is simpler for
--   expressing more complex queries (with many relational algebra
--   operations). Moreover, it supports intermediate results (i.e.,
--   subqueries). Finally, if you need to implement <b>ETL/ELT data
--   flows</b> that use the relational operations defined in
--   <a>RTable.Core</a> to analyze data and also combine them with
--   various <b>Column Mappings</b> (<tt>RColMapping</tt>) to
--   achieve various data transformations, then Julius is the appropriate
--   tool for the job.
--   
--   See this <a>Julius Tutorial</a>
--   
--   <h1>Overview</h1>
--   
--   An <a>RTable</a> is logically a container of <a>RTuple</a>s (similar
--   to the concept of a Relation being a set of Tuples) and is the core
--   data type in this module. The <a>RTuple</a> is a map of (Column-Name,
--   Column-Value) pairs. A Column-Name is modeled with the
--   <a>ColumnName</a> data type, while the Column-Value is modeled with
--   the <a>RDataType</a>, which is a wrapper over the most common data
--   types that one would expect to find in a column of a Table (e.g.,
--   integers, rational numbers, strings, dates etc.).
--   
--   We said that the <a>RTable</a> is a container of <a>RTuple</a>s and
--   thus the <a>RTable</a> is a <a>Monad</a>! So one can write monadic
--   code to implement RTable operations. For example:
--   
--   <pre>
--   -- | Return a new RTable after modifying each RTuple of the input RTable.
--   myRTableOperation :: RTable -&gt; RTable
--   myRTableOperation rtab = do
--           rtup &lt;- rtab
--           let new_rtup = doStuff rtup
--           return new_rtup
--       where
--           doStuff :: RTuple -&gt; RTuple
--           doStuff = ...  -- to be defined
--   </pre>
--   
--   Many different types of data can be turned into an <a>RTable</a>. For
--   example, CSV data can easily be turned into an <a>RTable</a> via the
--   <a>toRTable</a> function. Many other types of data could be
--   represented as "tabular data" via the <a>RTable</a> data type, as long
--   as they adhere to the interface posed by the <a>RTabular</a> type
--   class. In other words, any data type that we want to convert into an
--   RTable and vice-versa, must become an instance of the <a>RTabular</a>
--   type class and thus define the basic <a>toRTable</a> and
--   <a>fromRTable</a> functions.
--   
--   <h2>An Example</h2>
--   
--   In this example we read a CSV file with the use of the
--   <tt>readCSV</tt> function from the <a>RTable.Data.CSV</a> module.
--   Then, with the use of the <a>toRTable</a> function, implemented in the
--   <a>RTabular</a> instance of the <tt>CSV</tt> data type, we convert the
--   CSV file into an <a>RTable</a>. The data of the CSV file consist of
--   metadata from an imaginary Oracle database and each row represents an
--   entry for a table stored in this database, with information (i.e.,
--   columns) pertaining to the owner of the table, the tablespace name,
--   the status of the table and various statistics, such as the number of
--   rows and number of blocks.
--   
--   In this example, we apply three "transformations" to the input data
--   and we print the result after each one, with the use of the
--   <a>printfRTable</a> function. The transformations are:
--   
--   <ol>
--   <li>a <a>limit</a> operation, where we return the first N
--   <a>RTuple</a>s,</li>
--   <li>an <a>RFilter</a> operation that returns only the tables that
--   start with a 'B', followed by a projection operation
--   (<a>RPrj</a>)</li>
--   <li>an inner-join (<a>RInJoin</a>), where we pair the <a>RTuple</a>s
--   from the previous results based on a join predicate
--   (<a>RJoinPredicate</a>): the tables that have been analyzed the same
--   day</li>
--   </ol>
--   
--   Finally, we store the results of the 2nd operation into a new CSV
--   file, with the use of the <a>fromRTable</a> function implemented for
--   the <a>RTabular</a> instance of the <tt>CSV</tt> data type.
--   
--   <pre>
--   import  RTable.Core
--   import  RTable.Data.CSV     (CSV, readCSV, writeCSV, toRTable)
--   import  Data.Text as T          (take, pack)
--   
--   -- This is the input source table metadata
--   src_DBTab_MData :: RTableMData
--   src_DBTab_MData = 
--       createRTableMData   (   "sourceTab"  -- table name
--                               ,[  ("OWNER", Varchar)                                      -- Owner of the table
--                                   ,("TABLE_NAME", Varchar)                                -- Name of the table
--                                   ,("TABLESPACE_NAME", Varchar)                           -- Tablespace name
--                                   ,("STATUS",Varchar)                                     -- Status of the table object (VALID/INVALID)
--                                   ,("NUM_ROWS", Integer)                                  -- Number of rows in the table
--                                   ,("BLOCKS", Integer)                                    -- Number of Blocks allocated for this table
--                                   ,("LAST_ANALYZED", Timestamp "MM/DD/YYYY HH24:MI:SS")   -- Timestamp of the last time the table was analyzed (i.e., gathered statistics)
--                               ]
--                           )
--                           ["OWNER", "TABLE_NAME"] -- primary key
--                           [] -- (alternative) unique keys
--   
--   
--   -- Result RTable metadata
--   result_tab_MData :: RTableMData
--   result_tab_MData = 
--       createRTableMData   (   "resultTab"  -- table name
--                               ,[  ("OWNER", Varchar)                                        -- Owner of the table
--                                   ,("TABLE_NAME", Varchar)                                  -- Name of the table
--                                   ,("LAST_ANALYZED", Timestamp "MM/DD/YYYY HH24:MI:SS")   -- Timestamp of the last time the table was analyzed (i.e., gathered statistics)
--                               ]
--                           )
--                           ["OWNER", "TABLE_NAME"] -- primary key
--                           [] -- (alternative) unique keys
--   
--   
--   main :: IO()
--   main = do
--        -- read source csv file
--       srcCSV &lt;- readCSV "./app/test-data.csv"
--   
--       putStrLn "\nHow many rows you want to print from the source table? :\n"
--       n &lt;- readLn :: IO Int    
--       
--       -- RTable A
--       printfRTable (  -- define the order by which the columns will appear on screen. Use the default column formatting.
--                       genRTupleFormat ["OWNER", "TABLE_NAME", "TABLESPACE_NAME", "STATUS", "NUM_ROWS", "BLOCKS", "LAST_ANALYZED"] genDefaultColFormatMap) $ 
--                           limit n $ toRTable src_DBTab_MData srcCSV 
--   
--       putStrLn "\nThese are the tables that start with a \"B\":\n"
--       
--       -- RTable B
--       printfRTable ( genRTupleFormat ["OWNER", "TABLE_NAME","LAST_ANALYZED"] genDefaultColFormatMap) $ 
--           tabs_start_with_B $ toRTable src_DBTab_MData srcCSV 
--       
--       putStrLn "\nThese are the tables that were analyzed the same day:\n"    
--       
--       -- RTable C = A InnerJoin B
--       printfRTable ( genRTupleFormat ["OWNER", "TABLE_NAME", "LAST_ANALYZED", "OWNER_1", "TABLE_NAME_1", "LAST_ANALYZED_1"] genDefaultColFormatMap) $ 
--           ropB  myJoin
--                       (limit n $ toRTable src_DBTab_MData srcCSV) 
--                       (tabs_start_with_B $ toRTable src_DBTab_MData srcCSV)
--   
--       -- save result of 2nd operation to CSV file
--       writeCSV "./app/result-data.csv" $ 
--                       fromRTable result_tab_MData $ 
--                           tabs_start_with_B $ 
--                               toRTable src_DBTab_MData srcCSV 
--   
--       where
--           -- Return RTuples with a table_name starting with a 'B'
--           tabs_start_with_B :: RTable -&gt; RTable
--           tabs_start_with_B rtab = (ropU myProjection) . (ropU myFilter) $ rtab
--               where
--                   -- Create a Filter Operation to return only RTuples with table_name starting with a 'B'
--                   myFilter = RFilter (   \t -&gt;   let 
--                                                       tbname = case toText (t &lt;!&gt; "TABLE_NAME") of
--                                                                   Just t -&gt; t
--                                                                   Nothing -&gt; pack ""
--                                                   in (T.take 1 tbname) == (pack "B")
--                                       )
--                   -- Create a Projection Operation that projects only three columns
--                   myProjection = RPrj ["OWNER", "TABLE_NAME", "LAST_ANALYZED"]
--   
--           -- Create an Inner Join for tables analyzed in the same day
--           myJoin :: ROperation
--           myJoin = RInJoin ( \t1 t2 -&gt; 
--                                   let
--                                       RTime {rtime = RTimestampVal {year = y1, month = m1, day = d1, hours24 = hh1, minutes = mm1, seconds = ss1}} = t1&lt;!&gt;"LAST_ANALYZED"
--                                       RTime {rtime = RTimestampVal {year = y2, month = m2, day = d2, hours24 = hh2, minutes = mm2, seconds = ss2}} = t2&lt;!&gt;"LAST_ANALYZED"
--                                   in y1 == y2 &amp;&amp; m1 == m2 &amp;&amp; d1 == d2
--                           )
--   </pre>
--   
--   And here is the output:
--   
--   <pre>
--   :l ./src/RTable/example.hs
--   :set -XOverloadedStrings
--   main
--   </pre>
--   
--   <pre>
--   How many rows you want to print from the source table? :
--   
--   10
--   ---------------------------------------------------------------------------------------------------------------------------------
--   OWNER           TABLE_NAME                        TABLESPACE_NAME     STATUS     NUM_ROWS     BLOCKS     LAST_ANALYZED
--   ~~~~~           ~~~~~~~~~~                        ~~~~~~~~~~~~~~~     ~~~~~~     ~~~~~~~~     ~~~~~~     ~~~~~~~~~~~~~
--   APEX_030200     SYS_IOT_OVER_71833                SYSAUX              VALID      0            0          06/08/2012 16:22:36
--   APEX_030200     WWV_COLUMN_EXCEPTIONS             SYSAUX              VALID      3            3          06/08/2012 16:22:33
--   APEX_030200     WWV_FLOWS                         SYSAUX              VALID      10           3          06/08/2012 22:01:21
--   APEX_030200     WWV_FLOWS_RESERVED                SYSAUX              VALID      0            0          06/08/2012 16:22:33
--   APEX_030200     WWV_FLOW_ACTIVITY_LOG1$           SYSAUX              VALID      1            29         07/20/2012 19:07:57
--   APEX_030200     WWV_FLOW_ACTIVITY_LOG2$           SYSAUX              VALID      14           29         07/20/2012 19:07:57
--   APEX_030200     WWV_FLOW_ACTIVITY_LOG_NUMBER$     SYSAUX              VALID      1            3          07/20/2012 19:08:00
--   APEX_030200     WWV_FLOW_ALTERNATE_CONFIG         SYSAUX              VALID      0            0          06/08/2012 16:22:33
--   APEX_030200     WWV_FLOW_ALT_CONFIG_DETAIL        SYSAUX              VALID      0            0          06/08/2012 16:22:33
--   APEX_030200     WWV_FLOW_ALT_CONFIG_PICK          SYSAUX              VALID      37           3          06/08/2012 16:22:33
--   
--   
--   10 rows returned
--   ---------------------------------------------------------------------------------------------------------------------------------
--   
--   These are the tables that start with a "B":
--   
--   -------------------------------------------------------------
--   OWNER      TABLE_NAME                LAST_ANALYZED
--   ~~~~~      ~~~~~~~~~~                ~~~~~~~~~~~~~
--   DBSNMP     BSLN_BASELINES            04/15/2018 16:14:51
--   DBSNMP     BSLN_METRIC_DEFAULTS      06/08/2012 16:06:41
--   DBSNMP     BSLN_STATISTICS           04/15/2018 17:41:33
--   DBSNMP     BSLN_THRESHOLD_PARAMS     06/08/2012 16:06:41
--   SYS        BOOTSTRAP$                04/14/2014 13:53:43
--   
--   
--   5 rows returned
--   -------------------------------------------------------------
--   
--   These are the tables that were analyzed the same day:
--   
--   -------------------------------------------------------------------------------------------------------------------------------------
--   OWNER           TABLE_NAME                     LAST_ANALYZED           OWNER_1     TABLE_NAME_1              LAST_ANALYZED_1
--   ~~~~~           ~~~~~~~~~~                     ~~~~~~~~~~~~~           ~~~~~~~     ~~~~~~~~~~~~              ~~~~~~~~~~~~~~~
--   APEX_030200     SYS_IOT_OVER_71833             06/08/2012 16:22:36     DBSNMP      BSLN_THRESHOLD_PARAMS     06/08/2012 16:06:41
--   APEX_030200     SYS_IOT_OVER_71833             06/08/2012 16:22:36     DBSNMP      BSLN_METRIC_DEFAULTS      06/08/2012 16:06:41
--   APEX_030200     WWV_COLUMN_EXCEPTIONS          06/08/2012 16:22:33     DBSNMP      BSLN_THRESHOLD_PARAMS     06/08/2012 16:06:41
--   APEX_030200     WWV_COLUMN_EXCEPTIONS          06/08/2012 16:22:33     DBSNMP      BSLN_METRIC_DEFAULTS      06/08/2012 16:06:41
--   APEX_030200     WWV_FLOWS                      06/08/2012 22:01:21     DBSNMP      BSLN_THRESHOLD_PARAMS     06/08/2012 16:06:41
--   APEX_030200     WWV_FLOWS                      06/08/2012 22:01:21     DBSNMP      BSLN_METRIC_DEFAULTS      06/08/2012 16:06:41
--   APEX_030200     WWV_FLOWS_RESERVED             06/08/2012 16:22:33     DBSNMP      BSLN_THRESHOLD_PARAMS     06/08/2012 16:06:41
--   APEX_030200     WWV_FLOWS_RESERVED             06/08/2012 16:22:33     DBSNMP      BSLN_METRIC_DEFAULTS      06/08/2012 16:06:41
--   APEX_030200     WWV_FLOW_ALTERNATE_CONFIG      06/08/2012 16:22:33     DBSNMP      BSLN_THRESHOLD_PARAMS     06/08/2012 16:06:41
--   APEX_030200     WWV_FLOW_ALTERNATE_CONFIG      06/08/2012 16:22:33     DBSNMP      BSLN_METRIC_DEFAULTS      06/08/2012 16:06:41
--   APEX_030200     WWV_FLOW_ALT_CONFIG_DETAIL     06/08/2012 16:22:33     DBSNMP      BSLN_THRESHOLD_PARAMS     06/08/2012 16:06:41
--   APEX_030200     WWV_FLOW_ALT_CONFIG_DETAIL     06/08/2012 16:22:33     DBSNMP      BSLN_METRIC_DEFAULTS      06/08/2012 16:06:41
--   APEX_030200     WWV_FLOW_ALT_CONFIG_PICK       06/08/2012 16:22:33     DBSNMP      BSLN_THRESHOLD_PARAMS     06/08/2012 16:06:41
--   APEX_030200     WWV_FLOW_ALT_CONFIG_PICK       06/08/2012 16:22:33     DBSNMP      BSLN_METRIC_DEFAULTS      06/08/2012 16:06:41
--   
--   
--   14 rows returned
--   -------------------------------------------------------------------------------------------------------------------------------------
--   </pre>
--   
--   Check the output CSV file
--   
--   <pre>
--   $ head ./app/result-data.csv
--   OWNER,TABLE_NAME,LAST_ANALYZED
--   DBSNMP,BSLN_BASELINES,04/15/2018 16:14:51
--   DBSNMP,BSLN_METRIC_DEFAULTS,06/08/2012 16:06:41
--   DBSNMP,BSLN_STATISTICS,04/15/2018 17:41:33
--   DBSNMP,BSLN_THRESHOLD_PARAMS,06/08/2012 16:06:41
--   SYS,BOOTSTRAP$,04/14/2014 13:53:43
--   </pre>
module RTable.Core

-- | Definition of the Relational Table entity. An <a>RTable</a> is a
--   "container" of <a>RTuple</a>s.
type RTable = Vector RTuple
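
-- The composition of immutable operations described in the module overview
-- can be sketched as follows. This is only an illustration built on the
-- containers package: Tuple, Table, select, project and validOwners are
-- hypothetical stand-ins, not the RTable.Core API.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical stand-ins for RTuple and RTable (assumption: plain maps and lists).
type Tuple = Map.Map String String
type Table = [Tuple]

-- Selection: keep the tuples that satisfy the predicate.
select :: (Tuple -> Bool) -> Table -> Table
select = filter

-- Projection: keep only the listed columns of every tuple.
project :: [String] -> Table -> Table
project cols = map (Map.filterWithKey (\k _ -> k `elem` cols))

-- A query is a composition of operations; each step returns a new table.
validOwners :: Table -> Table
validOwners = project ["OWNER"] . select isValid
  where
    isValid t = Map.lookup "STATUS" t == Just "VALID"

-- Made-up sample data for illustration.
sample :: Table
sample =
  [ Map.fromList [("OWNER", "SYS"), ("STATUS", "VALID")]
  , Map.fromList [("OWNER", "DBSNMP"), ("STATUS", "INVALID")]
  ]
```

-- Because every operation returns a new table, sample itself is unchanged
-- after validOwners sample has been evaluated.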

-- | Definition of the Relational Tuple. An <a>RTuple</a> is implemented as
--   a <a>HashMap</a> of (<a>ColumnName</a>, <a>RDataType</a>) pairs. This
--   ensures fast access to a column value by column name. Note that this
--   implies that an <a>RTuple</a> CANNOT have more than one column with
--   the same name (i.e., hashmap key) and, more importantly, that it DOES
--   NOT have a fixed order of columns, as is usual in RDBMS
--   implementations. This gives us the freedom to perform column change
--   operations very fast. The only place where we need a fixed column order
--   is when we load an <a>RTable</a> from a fixed-column structure
--   such as a CSV file. For this reason, we have embedded the notion of a
--   fixed column order in the <a>RTuple</a> metadata. See
--   <a>RTupleMData</a>.
type RTuple = HashMap ColumnName RDataType
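
-- The two consequences noted above (unique column names, no column
-- positions) can be seen in a small sketch. It assumes Data.Map from the
-- containers package as a stand-in for HashMap and Int as a stand-in for
-- RDataType; Tuple, tup and renameCol are hypothetical names.

```haskell
import qualified Data.Map.Strict as Map

-- Assumption: a simplified tuple with Int values only.
type Tuple = Map.Map String Int

-- A repeated column name cannot coexist: the last value for the key wins.
tup :: Tuple
tup = Map.fromList [("BLOCKS", 3), ("NUM_ROWS", 10), ("BLOCKS", 29)]

-- Access is by column name; Nothing is returned for a missing column.
blocks :: Maybe Int
blocks = Map.lookup "BLOCKS" tup

-- Renaming a column is a delete plus an insert; no positions are involved.
renameCol :: String -> String -> Tuple -> Tuple
renameCol old new t = case Map.lookup old t of
  Nothing -> t
  Just v -> Map.insert new v (Map.delete old t)
```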

-- | Definition of the Relational Data Type. This is the data type of the
--   values stored in each <a>RTable</a>. This is a strict data type,
--   meaning that whenever a value of type <a>RDataType</a> is evaluated,
--   all the fields it contains are evaluated as well.
data RDataType
RInt :: !Integer -> RDataType
[rint] :: RDataType -> !Integer
RText :: !Text -> RDataType
[rtext] :: RDataType -> !Text
RDate :: !Text -> !Text -> RDataType
[rdate] :: RDataType -> !Text

-- | e.g., "DD/MM/YYYY"
[dtformat] :: RDataType -> !Text
RTime :: !RTimestamp -> RDataType
[rtime] :: RDataType -> !RTimestamp
RDouble :: !Double -> RDataType
[rdouble] :: RDataType -> !Double
Null :: RDataType

-- | Basic data type to represent time. This is a strict data type, meaning
--   that whenever a value of type <a>RTimestamp</a> is evaluated, all the
--   fields it contains are evaluated as well.
data RTimestamp
RTimestampVal :: !Int -> !Int -> !Int -> !Int -> !Int -> !Int -> RTimestamp
[year] :: RTimestamp -> !Int
[month] :: RTimestamp -> !Int
[day] :: RTimestamp -> !Int
[hours24] :: RTimestamp -> !Int
[minutes] :: RTimestamp -> !Int
[seconds] :: RTimestamp -> !Int

-- | Metadata for an RTable
data RTableMData
RTableMData :: RTableName -> RTupleMData -> [ColumnName] -> [[ColumnName]] -> RTableMData

-- | Name of the <a>RTable</a>
[rtname] :: RTableMData -> RTableName

-- | Tuple-level metadata
[rtuplemdata] :: RTableMData -> RTupleMData

-- | Primary Key
[pkColumns] :: RTableMData -> [ColumnName]

-- | List of unique keys i.e., each sublist is a unique key column
--   combination
[uniqueKeys] :: RTableMData -> [[ColumnName]]

-- | Basic Metadata of an <a>RTuple</a>. The <a>RTuple</a> metadata are
--   accessed through a <a>HashMap</a> <a>ColumnName</a> <a>ColumnInfo</a>
--   structure. I.e., for each column of the <a>RTuple</a>, we access the
--   <a>ColumnInfo</a> structure to get Column-level metadata. This access
--   is achieved by <a>ColumnName</a>. However, in order to provide the
--   "impression" of a fixed column order per tuple (see <a>RTuple</a>
--   definition), we provide another <a>HashMap</a>, the <a>HashMap</a>
--   <a>ColumnOrder</a> <a>ColumnName</a>. So in the following example, if
--   we want to access the <a>RTupleMData</a> tupmdata ColumnInfo by column
--   order, (assuming that we have N columns) we have to do the following:
--   
--   <pre>
--   (snd tupmdata)!((fst tupmdata)!0)
--   (snd tupmdata)!((fst tupmdata)!1)
--   ...
--   (snd tupmdata)!((fst tupmdata)!(N-1))
--   </pre>
--   
--   In the same manner in order to access the column of an <a>RTuple</a>
--   (e.g., tup) by column order, we do the following:
--   
--   <pre>
--   tup!((fst tupmdata)!0)
--   tup!((fst tupmdata)!1)
--   ...
--   tup!((fst tupmdata)!(N-1))
--   </pre>
type RTupleMData = (HashMap ColumnOrder ColumnName, HashMap ColumnName ColumnInfo)
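
-- The two-map lookup described above can be sketched in a total (non-failing)
-- form. This is an illustration only: it assumes Data.Map in place of HashMap
-- and a plain String in place of ColumnInfo, and TupleMData, infoAt and
-- inOrder are hypothetical names.

```haskell
import qualified Data.Map.Strict as Map

-- Local simplifications (assumptions), defined only for this sketch.
type ColumnName = String
type ColumnOrder = Int
type Info = String
type TupleMData = (Map.Map ColumnOrder ColumnName, Map.Map ColumnName Info)

-- Made-up metadata for a two-column tuple.
mdata :: TupleMData
mdata =
  ( Map.fromList [(0, "OWNER"), (1, "NUM_ROWS")]
  , Map.fromList [("OWNER", "Varchar"), ("NUM_ROWS", "Integer")]
  )

-- Column info by position: order -> name -> info.
infoAt :: TupleMData -> ColumnOrder -> Maybe Info
infoAt (byOrder, byName) i = Map.lookup i byOrder >>= \c -> Map.lookup c byName

-- Values of a tuple listed in the column order given by the metadata.
inOrder :: TupleMData -> Map.Map ColumnName v -> [Maybe v]
inOrder (byOrder, _) tup = [Map.lookup c tup | c <- Map.elems byOrder]
```

-- Returning Maybe instead of using a partial (!) lookup is a design choice of
-- this sketch; it makes an out-of-range position visible as Nothing.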

-- | Basic metadata for a column of an RTuple
data ColumnInfo
ColumnInfo :: ColumnName -> ColumnDType -> ColumnInfo
[name] :: ColumnInfo -> ColumnName
[dtype] :: ColumnInfo -> ColumnDType
type ColumnOrder = Int

-- | Definition of the Name type
type Name = Text

-- | Definition of the Column Name
type ColumnName = Name

-- | Definition of the Table Name
type RTableName = Name

-- | This is used only for metadata purposes (see <a>ColumnInfo</a>). The
--   actual data type of a value is an RDataType. The Text component of the
--   Date and Timestamp data constructors is the date format, e.g.,
--   "DD/MM/YYYY", "DD/MM/YYYY HH24:MI:SS".
data ColumnDType
UknownType :: ColumnDType
Integer :: ColumnDType
Varchar :: ColumnDType
Date :: Text -> ColumnDType
Timestamp :: Text -> ColumnDType
Double :: ColumnDType
type Delimiter = String

-- | Basic class to represent a data type that can be turned into an
--   <a>RTable</a>. It implements the concept of "tabular data"
class RTabular a
toRTable :: RTabular a => RTableMData -> a -> RTable
fromRTable :: RTabular a => RTableMData -> RTable -> a
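
-- To illustrate what an instance of such a class involves, here is a minimal
-- sketch with a hypothetical class Tabular that mirrors the shape of RTabular.
-- Header (an ordered list of column names) stands in for RTableMData, and
-- Rows is a made-up CSV-like type; none of these names come from the library.

```haskell
import qualified Data.Map.Strict as Map

-- Assumptions: simplified tuple, table and metadata types.
type Tuple = Map.Map String String
type Table = [Tuple]
type Header = [String]

-- Hypothetical analogue of the RTabular class.
class Tabular a where
  toTable :: Header -> a -> Table
  fromTable :: Header -> Table -> a

-- Rows of fields in header order, as a CSV parser might produce them.
newtype Rows = Rows [[String]] deriving (Eq, Show)

instance Tabular Rows where
  -- Pair each field with its column name, row by row.
  toTable hdr (Rows rs) = [Map.fromList (zip hdr r) | r <- rs]
  -- Emit fields in header order; a missing column becomes an empty field.
  fromTable hdr tab = Rows [[Map.findWithDefault "" c t | c <- hdr] | t <- tab]
```

-- The header carries the fixed column order that the tuples themselves do not
-- have, which is why both directions take it as an argument.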

-- | Definition of Relational Algebra operations. These are the valid
--   operations between RTables
data ROperation
ROperationEmpty :: ROperation

-- | Union
RUnion :: ROperation

-- | Intersection
RInter :: ROperation

-- | Difference
RDiff :: ROperation

-- | Projection
RPrj :: [ColumnName] -> ROperation
[colPrjList] :: ROperation -> [ColumnName]

-- | Filter operation (an <a>RPredicate</a> can be any function of the
--   signature <tt> RTuple -&gt; Bool </tt> so it is much more powerful
--   than a typical SQL filter expression, which is a boolean expression of
--   comparison operators)
RFilter :: RPredicate -> ROperation
[fpred] :: ROperation -> RPredicate

-- | Inner Join (any type of join predicate allowed. Any function with a
--   signature of the form: <tt> RTuple -&gt; RTuple -&gt; Bool </tt> is a
--   valid join predicate. I.e., a function which returns <a>True</a> when
--   two <tt>RTuples</tt> must be paired)
RInJoin :: RJoinPredicate -> ROperation
[jpred] :: ROperation -> RJoinPredicate

-- | Left Outer Join
RLeftJoin :: RJoinPredicate -> ROperation
[jpred] :: ROperation -> RJoinPredicate

-- | Right Outer Join
RRightJoin :: RJoinPredicate -> ROperation
[jpred] :: ROperation -> RJoinPredicate

-- | Semi-Join
RSemiJoin :: RJoinPredicate -> ROperation
[jpred] :: ROperation -> RJoinPredicate

-- | Anti-Join
RAntiJoin :: RJoinPredicate -> ROperation
[jpred] :: ROperation -> RJoinPredicate

-- | Performs aggregation operations on specific columns and returns a
--   singleton RTable
RAggregate :: [RAggOperation] -> ROperation

-- | list of aggregates
[aggList] :: ROperation -> [RAggOperation]

-- | A Group By operation. The SQL equivalent is: <tt> SELECT colGrByList,
--   aggList FROM... GROUP BY colGrByList </tt>. Note that, compared to SQL,
--   we can have a more generic grouping predicate (i.e., when two
--   <a>RTuple</a>s should belong to the same group) than just the equality
--   of values on the common columns between two <a>RTuple</a>s. Also note
--   that in the case of an aggregation without grouping (equivalent to a
--   single-group group by), the grouping predicate should be: <tt> \_
--   _ -&gt; True </tt>
RGroupBy :: RGroupPredicate -> [RAggOperation] -> [ColumnName] -> ROperation

-- | the grouping predicate
[gpred] :: ROperation -> RGroupPredicate

-- | list of aggregates
[aggList] :: ROperation -> [RAggOperation]

-- | the Group By list of columns
[colGrByList] :: ROperation -> [ColumnName]

-- | A combination of unary <a>ROperation</a>s, e.g., <tt> (p plist).(f
--   pred) </tt> (i.e., RPrj . RFilter), in the form of an <tt> RTable
--   -&gt; RTable </tt> function. In this sense we can also include a
--   binary operation (e.g., join), if we partially apply the join to one
--   <a>RTable</a>, e.g.,
--   
--   <pre>
--   (ij jpred rtab) . (p plist) . (f pred)
--   </pre>
RCombinedOp :: UnaryRTableOperation -> ROperation
[rcombOp] :: ROperation -> UnaryRTableOperation

-- | A generic binary <a>ROperation</a>.
RBinOp :: BinaryRTableOperation -> ROperation
[rbinOp] :: ROperation -> BinaryRTableOperation

-- | Order the <a>RTuple</a>s of the <a>RTable</a> according to the
--   specified list of Columns. The first column in the input list has the
--   highest priority in the sorting order.
ROrderBy :: [(ColumnName, OrderingSpec)] -> ROperation
[colOrdList] :: ROperation -> [(ColumnName, OrderingSpec)]

-- | A generic unary operation on a RTable
type UnaryRTableOperation = RTable -> RTable

-- | A generic binary operation on RTable
type BinaryRTableOperation = RTable -> RTable -> RTable

-- | This data type represents all possible aggregate operations over an
--   RTable. Examples are : Sum, Count, Average, Min, Max but it can be any
--   other "aggregation". The essential property of an aggregate operation
--   is that it acts on an RTable (or on a group of RTuples - in the case
--   of the RGroupBy operation) and produces a single RTuple.
--   
--   An aggregate operation is applied on a specific column (source column)
--   and the aggregated result will be stored in the target column. It is
--   important to understand that the produced aggregated RTuple is
--   different from the input RTuples. It is a totally new RTuple, that
--   will consist of the aggregated column(s) (and the grouping columns in
--   the case of an RGroupBy).
data RAggOperation
RAggOperation :: ColumnName -> ColumnName -> (RTable -> RTuple) -> RAggOperation

-- | Source column
[sourceCol] :: RAggOperation -> ColumnName

-- | Target column
[targetCol] :: RAggOperation -> ColumnName

-- | The aggregate function to be applied on an RTable
[aggFunc] :: RAggOperation -> RTable -> RTuple

-- | Aggregation Function type. An aggregation function receives as input a
--   source column (i.e., a <a>ColumnName</a>) of a source <a>RTable</a>
--   and returns an aggregated value, which is the result of the
--   aggregation on the values of the source column.
type AggFunction = ColumnName -> RTable -> RDataType
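
-- A function of this shape can be sketched as follows. The sketch assumes a
-- simplified table (a list of Data.Map tuples holding Integer values) instead
-- of RTable and RDataType; Agg, sumCol, countCol and aggregate are
-- hypothetical names, not library functions.

```haskell
import qualified Data.Map.Strict as Map
import Data.Maybe (mapMaybe)

-- Assumptions: Integer stands in for RDataType, a list for RTable.
type Tuple = Map.Map String Integer
type Table = [Tuple]

-- Same shape as AggFunction: source column, source table, aggregated value.
type Agg = String -> Table -> Integer

-- Sum of the values of a column; tuples lacking the column are skipped.
sumCol :: Agg
sumCol col = sum . mapMaybe (Map.lookup col)

-- Number of tuples that have a value for the column.
countCol :: Agg
countCol col = fromIntegral . length . mapMaybe (Map.lookup col)

-- The aggregated result is a brand-new single tuple with the target column.
aggregate :: Agg -> String -> String -> Table -> Tuple
aggregate agg src tgt tab = Map.singleton tgt (agg src tab)

-- Made-up sample data for illustration.
stats :: Table
stats =
  [ Map.fromList [("NUM_ROWS", 3), ("BLOCKS", 3)]
  , Map.fromList [("NUM_ROWS", 10)]
  ]
```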

-- | Returns an <a>RAggOperation</a> with a custom aggregation function
--   provided as input
raggGenericAgg :: AggFunction -> ColumnName -> ColumnName -> RAggOperation

-- | The Sum aggregate operation
raggSum :: ColumnName -> ColumnName -> RAggOperation

-- | The Count aggregate operation. Count aggregation (no distinct)
raggCount :: ColumnName -> ColumnName -> RAggOperation

-- | The CountDist aggregate operation. Count distinct aggregation (i.e.,
--   <tt>count(distinct col)</tt> in SQL). Returns the distinct number of
--   values for this column.
raggCountDist :: ColumnName -> ColumnName -> RAggOperation

-- | The CountStar aggregate operation. Returns the number of <a>RTuple</a>s
--   in the <a>RTable</a> (i.e., <tt>count(*)</tt> in SQL)
raggCountStar :: ColumnName -> RAggOperation

-- | The Average aggregate operation
raggAvg :: ColumnName -> ColumnName -> RAggOperation

-- | The Max aggregate operation
raggMax :: ColumnName -> ColumnName -> RAggOperation

-- | The Min aggregate operation
raggMin :: ColumnName -> ColumnName -> RAggOperation

-- | The StrAgg aggregate operation. This is known as "string_agg" in
--   PostgreSQL and "listagg" in Oracle. It aggregates the values of a text
--   <a>RDataType</a> column with a specified delimiter
raggStrAgg :: ColumnName -> ColumnName -> Delimiter -> RAggOperation

-- | A Predicate. It defines an arbitrary condition over the columns of an
--   <a>RTuple</a>. It is used primarily in the <a>RFilter</a>
--   operation and in the filter function <a>f</a>.
type RPredicate = RTuple -> Bool

-- | The Group By Predicate. It defines the condition for two <a>RTuple</a>s
--   to be included in the same group.
type RGroupPredicate = RTuple -> RTuple -> Bool
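
-- Grouping by an arbitrary binary predicate, rather than by column equality,
-- can be sketched on plain lists. groupByPred is a hypothetical helper and
-- not the library's implementation; it assumes the predicate behaves like an
-- equivalence relation (reflexive, symmetric, transitive).

```haskell
import Data.List (partition)

-- Split a list into groups: each group holds the elements that the
-- predicate pairs with the group's first element.
groupByPred :: (a -> a -> Bool) -> [a] -> [[a]]
groupByPred _ [] = []
groupByPred same (x : xs) = (x : mine) : groupByPred same rest
  where
    (mine, rest) = partition (same x) xs

-- Example predicate: two numbers belong together when they share parity.
sameParity :: Int -> Int -> Bool
sameParity a b = even a == even b
```

-- With the always-true predicate the whole input ends up in one group, which
-- corresponds to an aggregation without grouping.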

-- | The Join Predicate. It defines when two <a>RTuple</a>s should be
--   paired.
type RJoinPredicate = RTuple -> RTuple -> Bool

-- | The Upsert Predicate. It defines when two <a>RTuple</a>s should be
--   paired in a merge operation. The matching predicate must be applied on
--   a specific set of matching columns. The source <a>RTable</a> in the
--   Upsert operation must return a unique set of <a>RTuple</a>s, if
--   grouped by this set of matching columns. Otherwise an exception
--   (<a>UniquenessViolationInUpsert</a>) is thrown.
data RUpsertPredicate
RUpsertPredicate :: [ColumnName] -> (RTuple -> RTuple -> Bool) -> RUpsertPredicate
[matchCols] :: RUpsertPredicate -> [ColumnName]
[matchPred] :: RUpsertPredicate -> RTuple -> RTuple -> Bool

-- | Execute a Unary ROperation
runUnaryROperation :: ROperation -> RTable -> RTable

-- | ropU operator executes a unary ROperation. A short name for the
--   <a>runUnaryROperation</a> function
ropU :: ROperation -> RTable -> RTable

-- | Execute a Unary ROperation and return an <a>RTabResult</a>
runUnaryROperationRes :: ROperation -> RTable -> RTabResult

-- | ropUres operator executes a unary ROperation. A short name for the
--   <a>runUnaryROperationRes</a> function
ropUres :: ROperation -> RTable -> RTabResult

-- | Execute a Binary ROperation
runBinaryROperation :: ROperation -> RTable -> RTable -> RTable

-- | ropB operator executes a binary ROperation. A short name for the
--   <a>runBinaryROperation</a> function
ropB :: ROperation -> RTable -> RTable -> RTable

-- | Execute a Binary ROperation and return an <a>RTabResult</a>
runBinaryROperationRes :: ROperation -> RTable -> RTable -> RTabResult

-- | ropBres operator executes a binary ROperation. A short name for the
--   <a>runBinaryROperationRes</a> function
ropBres :: ROperation -> RTable -> RTable -> RTabResult

-- | Number of RTuples returned by an RTable operation
type RTuplesRet = Sum Int

-- | RTabResult is the result of an RTable operation and is a Writer Monad,
--   that includes the new RTable, as well as the number of RTuples
--   returned by the operation.
type RTabResult = Writer RTuplesRet RTable

-- | Creates an RTabResult (i.e., a Writer Monad) from a result RTable and
--   the number of RTuples that it returned
rtabResult :: (RTable, RTuplesRet) -> RTabResult

-- | Returns the info "stored" in the RTabResult Writer Monad
runRTabResult :: RTabResult -> (RTable, RTuplesRet)

-- | Returns the "log message" in the RTabResult Writer Monad, which is the
--   number of returned RTuples
execRTabResult :: RTabResult -> RTuplesRet

-- | Creates an RTuplesRet type
rtuplesRet :: Int -> RTuplesRet

-- | Return the number embedded in the RTuplesRet data type
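--   
--   A minimal sketch of a round trip through the Writer, using only the
--   functions documented in this module (<tt>rowCount</tt> is a
--   hypothetical helper name, not part of the library):
--   
--   <pre>
--   import RTable.Core
--   
--   -- Wrap a table together with a row count, then read the count back.
--   rowCount :: RTable -&gt; Int -&gt; Int
--   rowCount tab n =
--       let res      = rtabResult (tab, rtuplesRet n)
--           (_, ret) = runRTabResult res
--       in  getRTuplesRet ret
--   </pre>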
getRTuplesRet :: RTuplesRet -> Int

-- | Function composition.
(.) :: () => (b -> c) -> (a -> b) -> a -> c
infixr 9 .

-- | Right-to-left composition of Kleisli arrows.
--   <tt>(<a>&gt;=&gt;</a>)</tt>, with the arguments flipped.
--   
--   Note how this operator resembles function composition
--   <tt>(<a>.</a>)</tt>:
--   
--   <pre>
--   (.)   ::            (b -&gt;   c) -&gt; (a -&gt;   b) -&gt; a -&gt;   c
--   (&lt;=&lt;) :: Monad m =&gt; (b -&gt; m c) -&gt; (a -&gt; m b) -&gt; a -&gt; m c
--   </pre>
(<=<) :: Monad m => (b -> m c) -> (a -> m b) -> a -> m c
infixr 1 <=<

-- | Executes an RFilter operation
runRfilter :: RPredicate -> RTable -> RTable

-- | Filter (i.e. selection operator). A short name for the
--   <a>runRfilter</a> function
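--   
--   A minimal usage sketch. It assumes that string literals can be used
--   as <a>ColumnName</a>s (e.g., via <tt>OverloadedStrings</tt>), that an
--   <a>RPredicate</a> can be given as a lambda over an <a>RTuple</a>, and
--   a hypothetical "Country" column:
--   
--   <pre>
--   -- Keep only the RTuples whose "Country" column equals the text "GR".
--   -- greekRows is a hypothetical name used for illustration.
--   greekRows :: RTable -&gt; RTable
--   greekRows = f (\t -&gt; t &lt;!&gt; "Country" == RText "GR")
--   </pre>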
f :: RPredicate -> RTable -> RTable

-- | Implements an Inner Join operation between two RTables (any type of
--   join predicate is allowed). This Inner Join implementation follows
--   Oracle DB's convention for common column names. When we have two
--   tuples t1 and t2 with a common column name (let's say "Common"), then
--   the resulting tuple after a join will have "Common" and "Common_1",
--   so a "_1" suffix is appended. The tuple from the left table by
--   convention retains the original column name. So "Common_1" is the
--   column from the right table. If "Common_1" already exists, then
--   "Common_2" is used.
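--   
--   A minimal sketch of an equi-join, assuming both inputs have a
--   hypothetical "Id" column and that string literals can be used as
--   <a>ColumnName</a>s (e.g., via <tt>OverloadedStrings</tt>):
--   
--   <pre>
--   -- joinOnId is a hypothetical name used for illustration.
--   joinOnId :: RTable -&gt; RTable -&gt; RTable
--   joinOnId tabLeft tabRight =
--       runInnerJoinO (\t1 t2 -&gt; t1 &lt;!&gt; "Id" == t2 &lt;!&gt; "Id") tabLeft tabRight
--   -- Under the suffix convention described above, the joined RTuples are
--   -- expected to carry the left table's column as "Id" and the right
--   -- table's column as "Id_1".
--   </pre>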
runInnerJoinO :: RJoinPredicate -> RTable -> RTable -> RTable

-- | <a>RTable</a> Inner Join Operator. A short name for the
--   <a>runInnerJoinO</a> function
iJ :: RJoinPredicate -> RTable -> RTable -> RTable

-- | Implements a Left Outer Join operation between two RTables (any type
--   of join predicate is allowed), i.e., the rows of the left RTable will
--   be preserved. Note on duplicate keys: since the underlying structure
--   for an RTuple is a Data.HashMap.Strict, only one value per key is
--   allowed. So in the context of joining two RTuples the value of the
--   left RTuple on the common key will be preferred.
--   
--   A Left Join : <tt> tabLeft LEFT JOIN tabRight ON joinPred </tt> where
--   tabLeft is the preserving table can be defined as the Union between
--   the following two RTables:
--   
--   <ul>
--   <li>The result of the inner join: tabLeft INNER JOIN tabRight ON
--   joinPred</li>
--   <li>The rows from the preserving table (tabLeft) that do not satisfy
--   the join condition, enhanced with the columns of tabRight returning
--   Null values.</li>
--   </ul>
--   
--   The common columns will appear from both tables but only the left
--   table's columns will retain their original name.
runLeftJoin :: RJoinPredicate -> RTable -> RTable -> RTable

-- | RTable Left Outer Join Operator. A short name for the
--   <a>runLeftJoin</a> function
lJ :: RJoinPredicate -> RTable -> RTable -> RTable

-- | Implements a Right Outer Join operation between two RTables (any type
--   of join predicate is allowed), i.e., the rows of the right RTable will
--   be preserved. A Right Join : <tt> tabLeft RIGHT JOIN tabRight ON
--   joinPred </tt> where tabRight is the preserving table can be defined
--   as: the Union between the following two RTables:
--   
--   <ul>
--   <li>The result of the inner join: tabLeft INNER JOIN tabRight ON
--   joinPred</li>
--   <li>The rows from the preserving table (tabRight) that do not satisfy
--   the join condition, enhanced with the columns of tabLeft returning
--   Null values.</li>
--   </ul>
--   
--   The common columns will appear from both tables but only the right
--   table's columns will retain their original name.
runRightJoin :: RJoinPredicate -> RTable -> RTable -> RTable

-- | RTable Right Outer Join Operator. A short name for the
--   <a>runRightJoin</a> function
rJ :: RJoinPredicate -> RTable -> RTable -> RTable

-- | Implements a Full Outer Join operation between two RTables (any type
--   of join predicate is allowed). A full outer join is the union of the
--   left and right outer joins respectively. The common columns will
--   appear from both tables but only the left table's columns will retain
--   their original name (just by convention).
runFullOuterJoin :: RJoinPredicate -> RTable -> RTable -> RTable

-- | Implements a Right Outer Join operation between two RTables (any type
--   of join predicate is allowed), i.e., the rows of the right RTable will
--   be preserved. Note on duplicate keys: since the underlying structure
--   for an RTuple is a Data.HashMap.Strict, only one value per key is
--   allowed. So in the context of joining two RTuples the value of the
--   right RTuple on the common key will be preferred.
--   
--   RTable Full Outer Join Operator. A short name for the
--   <a>runFullOuterJoin</a> function
foJ :: RJoinPredicate -> RTable -> RTable -> RTable

-- | <a>RTable</a> semi-join operator. A short name for the
--   <a>runSemiJoin</a> function
sJ :: RJoinPredicate -> RTable -> RTable -> RTable

-- | Implements the semi-Join operation between two RTables (any type of
--   join predicate is allowed). It returns the <a>RTuple</a>s from the
--   left <a>RTable</a> that match with the right <a>RTable</a>. Note that
--   if an <a>RTuple</a> from the left <a>RTable</a> matches more than one
--   <a>RTuple</a> from the right <a>RTable</a> the semi join operation
--   will return only a single <a>RTuple</a>.
runSemiJoin :: RJoinPredicate -> RTable -> RTable -> RTable

-- | <a>RTable</a> anti-join operator. A short name for the
--   <a>runAntiJoin</a> function
aJ :: RJoinPredicate -> RTable -> RTable -> RTable

-- | Implements the anti-Join operation between two RTables (any type of
--   join predicate is allowed). It returns the <a>RTuple</a>s from the
--   left <a>RTable</a> that do not match with the right <a>RTable</a>.
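--   
--   A minimal sketch contrasting the anti-join with the semi-join. The
--   table roles (customers, orders) and the "CustId" column are
--   hypothetical, and string literals are assumed to be usable as
--   <a>ColumnName</a>s (e.g., via <tt>OverloadedStrings</tt>):
--   
--   <pre>
--   -- Hypothetical join predicate: match on a shared "CustId" column.
--   sameCust :: RJoinPredicate
--   sameCust c o = c &lt;!&gt; "CustId" == o &lt;!&gt; "CustId"
--   
--   -- Customers with no matching order at all.
--   withoutOrders :: RTable -&gt; RTable -&gt; RTable
--   withoutOrders customers orders = runAntiJoin sameCust customers orders
--   
--   -- Customers with at least one matching order (each returned once).
--   withOrders :: RTable -&gt; RTable -&gt; RTable
--   withOrders customers orders = runSemiJoin sameCust customers orders
--   </pre>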
runAntiJoin :: RJoinPredicate -> RTable -> RTable -> RTable

-- | Joins two RTuples into one. In this join we follow Oracle DB's
--   convention when joining two tuples with some common column names. When
--   we have two tuples t1 and t2 with a common column name (let's say
--   "Common"), then the resulting tuple after a join will have "Common"
--   and "Common_1", so a "_1" suffix is appended. The tuple from the left
--   table by convention retains the original column name. So "Common_1"
--   is the column from the right table. If "Common_1" already exists,
--   then "Common_2" is used.
joinRTuples :: RTuple -> RTuple -> RTuple

-- | Implements the union of two RTables as a union of two lists (see
--   <a>List</a>). Duplicates, and elements of the first list, are removed
--   from the second list, but if the first list contains duplicates, so
--   will the result.
--   
--   Implements the union of two RTables. Note that duplicate <a>RTuple</a>
--   elimination takes place.
runUnion :: RTable -> RTable -> RTable

-- | Implements the union-all of two RTables, i.e., a union without
--   duplicate <a>RTuple</a> elimination. Runs in O(m+n).
runUnionAll :: RTable -> RTable -> RTable

-- | RTable Union Operator. A short name for the <a>runUnion</a> function
u :: RTable -> RTable -> RTable

-- | Implements the intersection of two RTables
runIntersect :: RTable -> RTable -> RTable

-- | RTable Intersection Operator. A short name for the <a>runIntersect</a>
--   function
i :: RTable -> RTable -> RTable

-- | Implements the set Difference of two RTables as the diff of two lists
--   (see <a>List</a>).
runDiff :: RTable -> RTable -> RTable

-- | RTable Difference Operator. A short name for the <a>runDiff</a>
--   function
d :: RTable -> RTable -> RTable

-- | Implements RTable projection operation. If a column name does not
--   exist, then an empty RTable is returned.
runProjection :: [ColumnName] -> RTable -> RTable

-- | Implements RTable projection operation. If a column name does not
--   exist, then the returned RTable includes this column with a Null
--   value. This projection implementation allows missed hits.
runProjectionMissedHits :: [ColumnName] -> RTable -> RTable

-- | RTable Projection operator. A short name for the <a>runProjection</a>
--   function
p :: [ColumnName] -> RTable -> RTable

-- | Implements the aggregation operation on an RTable. It aggregates the
--   specific columns in each AggOperation and returns a singleton RTable,
--   i.e., an RTable with a single RTuple that includes only the agg
--   columns and their aggregated value.
runAggregation :: [RAggOperation] -> RTable -> RTable

-- | Aggregation Operator. A short name for the <a>runAggregation</a>
--   function
rAgg :: [RAggOperation] -> RTable -> RTable

-- | Implements the GROUP BY operation over an <a>RTable</a>.
runGroupBy :: RGroupPredicate -> [RAggOperation] -> [ColumnName] -> RTable -> RTable

-- | Group By Operator. A short name for the <a>runGroupBy</a> function
rG :: RGroupPredicate -> [RAggOperation] -> [ColumnName] -> RTable -> RTable

-- | Implement a grouping operation over an <a>RTable</a>. No aggregation
--   takes place. It returns the individual groups as separate
--   <a>RTable</a>s in a list. In total the initial set of <a>RTuple</a>s
--   is retained. If an empty <a>RTable</a> is provided as input, then a
--   ["empty RTable"] is returned.
groupNoAggList :: RGroupPredicate -> [ColumnName] -> RTable -> [RTable]

-- | Implement a grouping operation over an <a>RTable</a>. No aggregation
--   takes place. The output <a>RTable</a> has exactly the same
--   <a>RTuple</a>s, as the input, but these are grouped based on the input
--   grouping predicate. If an empty <a>RTable</a> is provided as input,
--   then an empty <a>RTable</a> is returned.
groupNoAgg :: RGroupPredicate -> [ColumnName] -> RTable -> RTable

-- | Implements the ORDER BY operation. The first column in the input list
--   has the highest priority in the sorting order. We treat Null as the
--   maximum value (anything compared to Null is smaller). This way Nulls
--   are sent to the end (i.e., "Nulls Last" in SQL parlance). This is for
--   Asc ordering. For Desc ordering, we have the opposite: Nulls go first
--   and so anything compared to Null is greater.
--   
--   <pre>
--   SQL example
--   with q as (select case when level &lt; 4 then level else NULL end c1 -- , level c2
--              from dual connect by level &lt; 7 )
--   select * from q order by c1
--   
--   C1
--   ----
--   1
--   2
--   3
--   Null
--   Null
--   Null
--   
--   with q as (select case when level &lt; 4 then level else NULL end c1 -- , level c2
--              from dual connect by level &lt; 7 )
--   select * from q order by c1 desc
--   </pre>
runOrderBy :: [(ColumnName, OrderingSpec)] -> RTable -> RTable

-- | Order By Operator. A short name for the <a>runOrderBy</a> function
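--   
--   A minimal usage sketch. It assumes <tt>Asc</tt> and <tt>Desc</tt> are
--   the constructors of <a>OrderingSpec</a>, that string literals can be
--   used as <a>ColumnName</a>s (e.g., via <tt>OverloadedStrings</tt>),
--   and hypothetical "Country" and "Amount" columns:
--   
--   <pre>
--   -- Sort by "Country" ascending, then by "Amount" descending.
--   -- The first pair in the list has the highest sorting priority.
--   sortedRows :: RTable -&gt; RTable
--   sortedRows = rO [("Country", Asc), ("Amount", Desc)]
--   </pre>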
rO :: [(ColumnName, OrderingSpec)] -> RTable -> RTable

-- | runCombinedROp: A Higher Order function that accepts as input a
--   combination of unary ROperations e.g., (p plist).(f pred) expressed in
--   the form of a function (RTable -&gt; RTable) and applies this function
--   to the input RTable. In this sense we can also include a binary
--   operation (e.g. join), if we partially apply the join to one RTable
--   e.g., (iJ jpred rtab) . (p plist) . (f pred)
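--   
--   A minimal sketch of such a combination, assuming hypothetical "Name"
--   and "Amount" columns and that string literals can be used as
--   <a>ColumnName</a>s (e.g., via <tt>OverloadedStrings</tt>):
--   
--   <pre>
--   -- Composition runs right to left: first drop the RTuples with a Null
--   -- "Amount", then project the two columns of interest.
--   namedAmounts :: RTable -&gt; RTable
--   namedAmounts =
--       runCombinedROp (p ["Name", "Amount"] . f (\t -&gt; isNotNull (t &lt;!&gt; "Amount")))
--   </pre>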
runCombinedROp :: (RTable -> RTable) -> RTable -> RTable

-- | A short name for the <a>runCombinedROp</a> function
rComb :: (RTable -> RTable) -> RTable -> RTable
data IgnoreDefault
Ignore :: IgnoreDefault
NotIgnore :: IgnoreDefault

-- | It receives a column name, a search value, a return value, a default
--   value, an ignore indicator and an RTable. It returns a new RTable
--   which is identical to the source one, except that for each RTuple,
--   for the specified column: if the search value is found then the
--   specified return value is returned; otherwise the default value is
--   returned if the ignore indicator is not set, or the existing value of
--   the column is kept if the ignore indicator is set. If you pass an
--   empty RTable, then it returns an empty RTable. Throws a
--   <a>ColumnDoesNotExist</a> exception, if the column does not exist.
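--   
--   A minimal usage sketch. The argument order (column name, search
--   value, return value, default value) is an assumption read off the
--   description above, and the "Status" column and its values are
--   hypothetical. String literals are assumed to be usable as
--   <a>ColumnName</a>s and inside <tt>RText</tt> (e.g., via
--   <tt>OverloadedStrings</tt>):
--   
--   <pre>
--   -- Where "Status" equals "A", return "Active". Since the ignore
--   -- indicator is not set, every other value becomes the default "Other".
--   labelStatus :: RTable -&gt; RTable
--   labelStatus =
--       decodeRTable "Status" (RText "A") (RText "Active") (RText "Other") NotIgnore
--   </pre>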
decodeRTable :: ColumnName -> RDataType -> RDataType -> RDataType -> IgnoreDefault -> RTable -> RTable

-- | It receives an RTuple and looks up the value at a specific column
--   name. Then it compares this value with the specified search value. If
--   it is equal to the search value then it returns the specified return
--   value. If not, then it returns the specified default value, if the
--   ignore indicator is not set, otherwise (if the ignore indicator is
--   set) it returns the existing value. If you pass an empty RTuple, then
--   it returns Null. Throws a <a>ColumnDoesNotExist</a> exception, if
--   this map contains no mapping for the key.
decodeColValue :: ColumnName -> RDataType -> RDataType -> RDataType -> IgnoreDefault -> RTuple -> RDataType

-- | Returns an <a>RTimestamp</a> from an input <a>String</a> and a format
--   <a>String</a>.
--   
--   Valid format patterns are:
--   
--   <ul>
--   <li>For year: <tt>YYYY</tt>, e.g., <tt>"0001"</tt>,
--   <tt>"2018"</tt></li>
--   <li>For month: <tt>MM</tt>, e.g., <tt>"01"</tt>, <tt>"1"</tt>,
--   <tt>"12"</tt></li>
--   <li>For day: <tt>DD</tt>, e.g., <tt>"01"</tt>, <tt>"1"</tt>,
--   <tt>"31"</tt></li>
--   <li>For hours: <tt>HH</tt>, <tt>HH24</tt> e.g., <tt>"00"</tt>,
--   <tt>"23"</tt> I.e., hours must be specified in 24 format</li>
--   <li>For minutes: <tt>MI</tt>, e.g., <tt>"01"</tt>, <tt>"1"</tt>,
--   <tt>"59"</tt></li>
--   <li>For seconds: <tt>SS</tt>, e.g., <tt>"01"</tt>, <tt>"1"</tt>,
--   <tt>"59"</tt></li>
--   </ul>
--   
--   Example of a typical format string is: <tt>"DD/MM/YYYY HH:MI:SS"</tt>
--   
--   If no valid format pattern is found then an
--   <a>UnsupportedTimeStampFormat</a> exception is thrown
toRTimestamp :: String -> String -> RTimestamp

-- | Creates an RTimestamp data type from an input timestamp format string
--   and a timestamp value represented as a <a>String</a>. Valid format
--   patterns are:
--   
--   <ul>
--   <li>For year: <tt>YYYY</tt>, e.g., <tt>"0001"</tt>,
--   <tt>"2018"</tt></li>
--   <li>For month: <tt>MM</tt>, e.g., <tt>"01"</tt>, <tt>"1"</tt>,
--   <tt>"12"</tt></li>
--   <li>For day: <tt>DD</tt>, e.g., <tt>"01"</tt>, <tt>"1"</tt>,
--   <tt>"31"</tt></li>
--   <li>For hours: <tt>HH</tt>, <tt>HH24</tt> e.g., <tt>"00"</tt>,
--   <tt>"23"</tt> I.e., hours must be specified in 24 format</li>
--   <li>For minutes: <tt>MI</tt>, e.g., <tt>"01"</tt>, <tt>"1"</tt>,
--   <tt>"59"</tt></li>
--   <li>For seconds: <tt>SS</tt>, e.g., <tt>"01"</tt>, <tt>"1"</tt>,
--   <tt>"59"</tt></li>
--   </ul>
--   
--   Example of a typical format string is: <tt>"DD/MM/YYYY HH:MI:SS"</tt>
--   
--   If no valid format pattern is found then an
--   <a>UnsupportedTimeStampFormat</a> exception is thrown
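--   
--   A minimal usage sketch, assuming the format string is the first
--   argument and the timestamp value the second, as the description above
--   suggests. The sample value is hypothetical:
--   
--   <pre>
--   -- The value's layout must follow the format string:
--   -- day/month/year, then hours (24-hour clock), minutes and seconds.
--   sampleTs :: RTimestamp
--   sampleTs = createRTimestamp "DD/MM/YYYY HH24:MI:SS" "25/12/2018 14:30:00"
--   </pre>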
createRTimestamp :: String -> String -> RTimestamp

-- | Convert an <a>RTimestamp</a> value to a Universal Time value
--   (<a>UTCTime</a>)
toUTCTime :: RTimestamp -> UTCTime

-- | Convert a Universal Time value (<a>UTCTime</a>) to an
--   <a>RTimestamp</a> value
fromUTCTime :: UTCTime -> RTimestamp

-- | rTimestampToRText: converts an RTimestamp value to RText. Valid input
--   formats are:
--   
--   <ul>
--   <li>1. <tt> "DD/MM/YYYY HH24:MI:SS" </tt></li>
--   <li>2. <tt> "YYYYMMDD-HH24.MI.SS" </tt></li>
--   <li>3. <tt> "YYYYMMDD" </tt></li>
--   <li>4. <tt> "YYYYMM" </tt></li>
--   <li>5. <tt> "YYYY" </tt></li>
--   </ul>
rTimestampToRText :: String -> RTimestamp -> RDataType

-- | Standard timestamp format. For example: "DD/MM/YYYY HH24:MI:SS"
stdTimestampFormat :: String

-- | Standard date format
stdDateFormat :: [Char]

-- | Search for the first occurrence of a substring within an <a>RText</a>
--   string and return the 1st character position, or <a>Nothing</a> if the
--   substring is not found, or if a non-text <a>RDataType</a> is given as
--   input.
instrRText :: RDataType -> RDataType -> Maybe Int

-- | Search for the first occurrence of a substring within a <a>String</a>
--   and return the 1st character position, or <a>Nothing</a> if the
--   substring is not found.
instr :: Eq a => [a] -> [a] -> Maybe Int

-- | Search for the first occurrence of a substring within a <a>Text</a>
--   string and return the 1st character position, or <a>Nothing</a> if the
--   substring is not found.
instrText :: Text -> Text -> Maybe Int

-- | Concatenates two Text <tt>RDataTypes</tt>, in all other cases of
--   <a>RDataType</a> it returns <a>Null</a>.
rdtappend :: RDataType -> RDataType -> RDataType

-- | stripRText : O(n) Remove leading and trailing white space from a
--   string. If the input RDataType is not an RText, then Null is returned
stripRText :: RDataType -> RDataType

-- | Helper function to remove a character around (from both beginning and
--   end) of an (RText t) value
removeCharAroundRText :: Char -> RDataType -> RDataType

-- | Returns <a>True</a> only if this is an <a>RText</a>
isText :: RDataType -> Bool

-- | It receives a column name, a default value and an RTable. It returns
--   a new RTable which is identical to the source one, except that in the
--   specified column every Null value in every RTuple has been replaced
--   by the default value. If you pass an empty RTable, then it returns an
--   empty RTable. Throws a <a>ColumnDoesNotExist</a> exception, if the
--   column does not exist.
nvlRTable :: ColumnName -> RDataType -> RTable -> RTable

-- | It receives a column name, a default value and an RTuple. It returns
--   a new RTuple which is identical to the source one but every Null
--   value in the specified column has been replaced by the default value.
nvlRTuple :: ColumnName -> RDataType -> RTuple -> RTuple

-- | Returns <a>True</a> if the input <a>RTuple</a> is a Null RTuple,
--   otherwise it returns <a>False</a>. Note that a Null RTuple has all its
--   values equal to <a>Null</a> but it still has columns. This is
--   different from an empty <a>RTuple</a>, which is an <a>RTuple</a> with
--   no columns and no values whatsoever. See <a>isRTupEmpty</a>.
isNullRTuple :: RTuple -> Bool

-- | Use this function to compare an RDataType with the Null value,
--   because due to Null logic x == Null or x /= Null will always return
--   False. It returns True if the input value is Null.
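--   
--   A minimal sketch of a Null test inside a predicate, assuming a
--   hypothetical "Amount" column and that string literals can be used as
--   <a>ColumnName</a>s (e.g., via <tt>OverloadedStrings</tt>):
--   
--   <pre>
--   -- Use isNull for the test. Writing (t &lt;!&gt; "Amount" == Null) instead
--   -- would be False for every RTuple, because of the Null logic.
--   missingAmount :: RTuple -&gt; Bool
--   missingAmount t = isNull (t &lt;!&gt; "Amount")
--   </pre>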
isNull :: RDataType -> Bool

-- | Use this function to compare an RDataType with the Null value,
--   because due to Null logic x == Null or x /= Null will always return
--   False. It returns True if the input value is not Null.
isNotNull :: RDataType -> Bool

-- | Returns the 1st parameter if this is not Null, otherwise it returns
--   the 2nd.
nvl :: RDataType -> RDataType -> RDataType

-- | Returns the value of a specific column (specified by name) if this is
--   not Null. If this value is Null, then it returns the 2nd parameter. If
--   you pass an empty RTuple, then it returns Null. Throws a
--   <a>ColumnDoesNotExist</a> exception, if this map contains no mapping
--   for the key.
nvlColValue :: ColumnName -> RDataType -> RTuple -> RDataType

-- | Test whether an RTable is empty
isRTabEmpty :: RTable -> Bool

-- | Get the first RTuple from an RTable
headRTup :: RTable -> RTuple

-- | returns the N first <a>RTuple</a>s of an <a>RTable</a>
limit :: Int -> RTable -> RTable

-- | Test whether an RTuple is empty
isRTupEmpty :: RTuple -> Bool

-- | Returns the value of an RTuple column based on the ColumnName key. If
--   the column name is not found, then it returns Null. Note that this
--   might be confusing, since there might be an existing column name with
--   a Null value.
getRTupColValue :: ColumnName -> RTuple -> RDataType

-- | Returns the value of an RTuple column based on the ColumnName key. If
--   the column name is not found, then it returns Nothing.
rtupLookup :: ColumnName -> RTuple -> Maybe RDataType

-- | Returns the value of an RTuple column based on the ColumnName key. If
--   the column name is not found, then it returns a default value.
rtupLookupDefault :: RDataType -> ColumnName -> RTuple -> RDataType

-- | Operator for getting a column value from an RTuple. Throws a
--   <a>ColumnDoesNotExist</a> exception, if this map contains no mapping
--   for the key.
(<!>) :: RTuple -> ColumnName -> RDataType

-- | Safe Operator for getting a column value from an RTuple. If the
--   column name is not found, then it returns Nothing.
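--   
--   A minimal sketch of a lookup that cannot throw, assuming a
--   hypothetical "Amount" column and that string literals can be used as
--   <a>ColumnName</a>s (e.g., via <tt>OverloadedStrings</tt>):
--   
--   <pre>
--   -- Fall back to Null when the column is missing from the RTuple.
--   amountOrNull :: RTuple -&gt; RDataType
--   amountOrNull t = case t &lt;!!&gt; "Amount" of
--       Just v  -&gt; v
--       Nothing -&gt; Null
--   </pre>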
(<!!>) :: RTuple -> ColumnName -> Maybe RDataType

-- | Turns an <a>RTable</a> to a list of <a>RTuple</a>s
rtableToList :: RTable -> [RTuple]

-- | Concatenates a list of <a>RTable</a>s to a single RTable. Essentially,
--   it unions (see <a>runUnion</a>) all <a>RTable</a>s of the list.
concatRTab :: [RTable] -> RTable

-- | Turns an RTuple to a List
rtupleToList :: RTuple -> [(ColumnName, RDataType)]

-- | toListRDataType: returns a list of RDataType values of an RTuple, in
--   the fixed column order of the RTuple
toListRDataType :: RTupleMData -> RTuple -> [RDataType]

-- | Return the Text out of an RDataType If a non-text RDataType is given
--   then Nothing is returned.
toText :: RDataType -> Maybe Text

-- | Return an <a>RDataType</a> from <a>Text</a>
fromText :: Text -> RDataType

-- | Map function over an <a>RTable</a>.
rtabMap :: (RTuple -> RTuple) -> RTable -> RTable

-- | This is a fold operation on an <a>RTable</a> that returns an
--   <a>RTable</a>. It is similar to: <tt> foldr' :: (a -&gt; b -&gt; b)
--   -&gt; b -&gt; Vector a -&gt; b </tt> of Vector, which is an O(n) right
--   fold with a strict accumulator.
rtabFoldr' :: (RTuple -> RTable -> RTable) -> RTable -> RTable -> RTable

-- | This is a fold operation on an <a>RTable</a> that returns an
--   <a>RTable</a>. It is similar to: <tt> foldl' :: (a -&gt; b -&gt; a)
--   -&gt; a -&gt; Vector b -&gt; a </tt> of Vector, which is an O(n) left
--   fold with a strict accumulator.
rtabFoldl' :: (RTable -> RTuple -> RTable) -> RTable -> RTable -> RTable

-- | O(n) Transform this <a>RTuple</a> by applying a function to every
--   value
rtupleMap :: (RDataType -> RDataType) -> RTuple -> RTuple

-- | O(n) Transform this <a>RTuple</a> by applying a function to every
--   value
rtupleMapWithKey :: (ColumnName -> RDataType -> RDataType) -> RTuple -> RTuple

-- | This is a fold operation on an <a>RTable</a> that returns an
--   <a>RDataType</a> value. It is similar to: <tt> foldr' :: (a -&gt; b
--   -&gt; b) -&gt; b -&gt; Vector a -&gt; b </tt> of Vector, which is an
--   O(n) right fold with a strict accumulator.
rdatatypeFoldr' :: (RTuple -> RDataType -> RDataType) -> RDataType -> RTable -> RDataType

-- | This is a fold operation on an <a>RTable</a> that returns an
--   <a>RDataType</a> value. It is similar to: <tt> foldl' :: (a -&gt; b
--   -&gt; a) -&gt; a -&gt; Vector b -&gt; a </tt> of Vector, which is an
--   O(n) left fold with a strict accumulator.
rdatatypeFoldl' :: (RDataType -> RTuple -> RDataType) -> RDataType -> RTable -> RDataType

-- | O(n) append an RTuple to an RTable. Please note that this is an
--   <b>immutable</b> implementation of an <a>RTable</a> insert. This
--   simply means that the insert operation returns a new <a>RTable</a> and
--   does not affect the original <a>RTable</a>.
insertAppendRTab :: RTuple -> RTable -> RTable

-- | O(n) prepend an RTuple to an RTable. Please note that this is an
--   <b>immutable</b> implementation of an <a>RTable</a> insert. This
--   simply means that the insert operation returns a new <a>RTable</a> and
--   does not affect the original <a>RTable</a>.
insertPrependRTab :: RTuple -> RTable -> RTable

-- | Insert an <a>RTable</a> to an existing <a>RTable</a>. This is
--   equivalent to an <tt>INSERT INTO SELECT</tt> clause in SQL. We want to
--   insert into an <a>RTable</a> the results of a "subquery", which in our
--   case is materialized via the input <a>RTable</a>. Please note that
--   this is an <b>immutable</b> implementation of an <a>RTable</a> insert.
--   This simply means that the insert operation returns a new
--   <a>RTable</a> and does not affect the original <a>RTable</a>. Also
--   note that the source and target <a>RTable</a>s should have the same
--   structure. By "structure", we mean that the <a>ColumnName</a>s and the
--   corresponding data types must match. Essentially what we record in the
--   <a>ColumnInfo</a> must be the same for the two <a>RTable</a>s.
--   Otherwise a <a>ConflictingRTableStructures</a> exception will be
--   thrown.
insertRTabToRTab :: RTable -> RTable -> RTable

-- | Delete <a>RTuple</a>s from an <a>RTable</a> based on an
--   <a>RPredicate</a>. Please note that this is an <b>immutable</b>
--   implementation of an <a>RTable</a> delete. This simply means that the
--   delete operation returns a new <a>RTable</a>. So, the original
--   <a>RTable</a> remains unchanged and no deletion in-place takes place
--   whatsoever. Moreover, if we have multiple threads deleting an
--   <a>RTable</a>, due to immutability, each thread "sees" its own copy of
--   the <a>RTable</a> and thus there is no need for locking the deleted
--   <a>RTuple</a>s, as happens in a common RDBMS.
deleteRTab :: RPredicate -> RTable -> RTable

-- | Update an RTable. The input includes a list of (ColumnName, new Value)
--   pairs. Also a filter predicate is specified, in order to restrict the
--   update only to those <a>RTuple</a>s that fulfill the predicate. Please
--   note that this is an <b>immutable</b> implementation of an
--   <a>RTable</a> update. This simply means that the update operation
--   returns a new <a>RTable</a> that includes all the <a>RTuple</a>s of
--   the original <a>RTable</a>, both the ones that have been updated and
--   the others that have not. So, the original <a>RTable</a> remains
--   unchanged and no update in-place takes place whatsoever. Moreover, if
--   we have multiple threads updating an <a>RTable</a>, due to
--   immutability, each thread "sees" its own copy of the <a>RTable</a> and
--   thus there is no need for locking the updated <a>RTuple</a>s, as
--   happens in a common RDBMS.
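--   
--   A minimal usage sketch, assuming hypothetical "Status" and "Country"
--   columns, that an <a>RPredicate</a> can be given as a lambda over an
--   <a>RTuple</a>, and that string literals can be used as
--   <a>ColumnName</a>s and inside <tt>RText</tt> (e.g., via
--   <tt>OverloadedStrings</tt>):
--   
--   <pre>
--   -- Roughly: UPDATE tab SET Status = 'closed' WHERE Country = 'GR'
--   -- The input RTable is left untouched and a new RTable is returned.
--   closeGreekRows :: RTable -&gt; RTable
--   closeGreekRows =
--       updateRTab [("Status", RText "closed")]
--                  (\t -&gt; t &lt;!&gt; "Country" == RText "GR")
--   </pre>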
updateRTab :: [(ColumnName, RDataType)] -> RPredicate -> RTable -> RTable

-- | Upsert (Update+Insert, aka Merge) Operation. We provide a source
--   <a>RTable</a> and a matching condition (<a>RUpsertPredicate</a>) to
--   the <a>RTuple</a>s of the target <a>RTable</a>. An <a>RTuple</a> from
--   the target <a>RTable</a> might match at most one <a>RTuple</a> in
--   the source <a>RTable</a>, or not match at all. If it is matched to
--   more than one <a>RTuple</a> then an exception
--   (<a>UniquenessViolationInUpsert</a>) is thrown. When an <a>RTuple</a>
--   from the target <a>RTable</a> is matched to a source <a>RTuple</a>,
--   then the corresponding columns of the target <a>RTuple</a> are updated
--   with the new values provided in the source <a>RTuple</a>. This takes
--   place for the target <a>RTuple</a>s that match but also that satisfy
--   the input <a>RPredicate</a>. Thus we can restrict further with a
--   filter the <a>RTuple</a>s of the target <a>RTable</a> where the update
--   will take place. Finally, the source <a>RTuple</a>s that did not match
--   to the target <a>RTable</a>, are inserted (appended) to the target
--   <a>RTable</a>
--   
--   Please note that this is an <b>immutable</b> implementation of an
--   <a>RTable</a> upsert. This simply means that the upsert operation
--   returns a new <a>RTable</a> and does not affect the original
--   <a>RTable</a>. Moreover, if we have multiple threads updating an
--   <a>RTable</a>, due to immutability, each thread "sees" its own copy of
--   the <a>RTable</a> and thus there is no need for locking the updated
--   <a>RTuple</a>s, as happens in a common RDBMS.
--   
--   Also note that the source and target <a>RTable</a>s should have the
--   same structure. By "structure", we mean that the <a>ColumnName</a>s
--   and the corresponding data types must match. Essentially what we
--   record in the <a>ColumnInfo</a> must be the same for the two
--   <a>RTable</a>s. Otherwise a <a>ConflictingRTableStructures</a>
--   exception will be thrown.
--   
--   <pre>
--   An Example:
--   Source RTable: src = 
--       Id  |   Msg         | Other
--       ----|---------------|-------
--       1   |   "hello2"    |"a"    
--       2   |   "world2"    |"a"    
--       3   |   "new"       |"a"    
--   
--   Target RTable: trg = 
--       Id  |   Msg         | Other
--       ----|---------------|-------
--       1   |   "hello1"    |"b"    
--       2   |   "world1"    |"b"    
--       4   |   "old"       |"b"    
--       5   |   "hello"     |"b"    
--   
--   &gt;&gt;&gt; upsertRTab  src
--                   RUpsertPredicate {matchCols = ["Id"], matchPred = \t1 t2 -&gt; t1 &lt;!&gt; "Id" == t2 &lt;!&gt; "Id" }
--                   ["Msg"]
--                   (\t -&gt;   let 
--                               msg = case toText (t &lt;!&gt; "Msg") of
--                                           Just t -&gt; t
--                                           Nothing -&gt; pack ""
--                           in (take 5 msg) == (pack "hello")
--                   )  -- Msg like "hello%"
--                   trg
--   
--   Result RTable: rslt = 
--       Id  |   Msg         | Other
--       ----|---------------|-------
--       1   |   "hello2"    |"b"   (Note that only column "Msg" has been overwritten, as per the 3rd argument) 
--       2   |   "world1"    |"b"    
--       3   |   "new"       |"a"    
--       4   |   "old"       |"b"    
--       5   |   "hello"     |"b"    
--   </pre>
upsertRTab :: RTable -> RUpsertPredicate -> [ColumnName] -> RPredicate -> RTable -> RTable

-- | Update an RTuple at a specific column specified by name with a value.
--   If the <a>ColumnName</a> exists, then the value is updated with the
--   input value. If the <a>ColumnName</a> does not exist, then a
--   <a>ColumnDoesNotExist</a> exception is thrown.
updateRTuple :: ColumnName -> RDataType -> RTuple -> RTuple

-- | Upsert (update/insert) an RTuple at a specific column specified by
--   name with a value. If the column name is not found then the
--   (columnName, value) pair is inserted. If it exists then the value is
--   updated with the input value.
upsertRTuple :: ColumnName -> RDataType -> RTuple -> RTuple

-- | emptyRTable: Create an empty RTable
emptyRTable :: RTable

-- | Creates an RTable with a single RTuple
createSingletonRTable :: RTuple -> RTable

-- | Creates an RTable from a list of RTuples
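--   
--   A minimal sketch of building a small RTable by hand. It assumes that
--   <tt>RInt</tt> and <tt>RText</tt> are <a>RDataType</a> constructors
--   for integers and text respectively, and that string literals can be
--   used as <a>ColumnName</a>s and inside <tt>RText</tt> (e.g., via
--   <tt>OverloadedStrings</tt>). The column names and values are
--   hypothetical:
--   
--   <pre>
--   -- Two RTuples with the same columns; the second has a Null "Msg".
--   sampleTab :: RTable
--   sampleTab = rtableFromList
--       [ createRTuple [("Id", RInt 1), ("Msg", RText "hello")]
--       , createRTuple [("Id", RInt 2), ("Msg", Null)]
--       ]
--   </pre>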
rtableFromList :: [RTuple] -> RTable

-- | addColumn: adds a column to an RTable
addColumn :: ColumnName -> RDataType -> RTable -> RTable

-- | removeColumn : removes a column from an RTable. The column is
--   specified by ColumnName. If this ColumnName does not exist in the
--   RTuple of the input RTable then nothing happens and the RTuple
--   remains intact.
removeColumn :: ColumnName -> RTable -> RTable

-- | Creates an empty RTuple (i.e., one with no column,value mappings)
emptyRTuple :: RTuple

-- | Creates a Null <a>RTuple</a> based on a list of input Column Names. A
--   <a>Null</a> <a>RTuple</a> is an <a>RTuple</a> where all column names
--   correspond to a <a>Null</a> value (<a>Null</a> is a data constructor
--   of <a>RDataType</a>)
createNullRTuple :: [ColumnName] -> RTuple

-- | createRTuple: Create an Rtuple from a list of column names and values
createRTuple :: [(ColumnName, RDataType)] -> RTuple

-- | Create an RTuple from a list
rtupleFromList :: [(ColumnName, RDataType)] -> RTuple

-- | createRDataType: Get a value of type a and return the corresponding
--   RDataType. The input value data type must be an instance of the
--   Typeable typeclass from Data.Typeable
createRDataType :: Typeable a => a -> RDataType

-- | createRTableMData: creates RTableMData from input given in the form
--   of a list. We assume that the column order of the input list defines
--   the fixed column order of the RTuple.
createRTableMData :: (RTableName, [(ColumnName, ColumnDType)]) -> [ColumnName] -> [[ColumnName]] -> RTableMData

-- | Get the Column Names of an RTable
getColumnNamesFromRTab :: RTable -> [ColumnName]

-- | Returns the Column Names of an RTuple
getColumnNamesFromRTuple :: RTuple -> [ColumnName]
getColumnInfoFromRTab :: RTable -> [ColumnInfo]
getColumnInfoFromRTuple :: RTuple -> [ColumnInfo]

-- | Take a column value and return its type
getTheType :: RDataType -> ColumnDType

-- | Define equality for two <a>ColumnInfo</a> structures. For two columns
--   to have "equal structure" they must have the same name and the same
--   type. If one of the two (or both) has an <a>UknownType</a>, then they
--   are still considered of equal structure.
--   
--   Creates a list of the form [(ColumnInfo, RDataType)] from a list of
--   ColumnInfo and an RTuple. The returned list respects the order of the
--   [ColumnInfo]. It guarantees that RDataTypes will be in the same column
--   order as [ColumnInfo], i.e., the correct RDataType for the correct
--   column
listOfColInfoRDataType :: [ColumnInfo] -> RTuple -> [(ColumnInfo, RDataType)]

-- | toListColumnName: returns a list of RTuple column names, in the fixed
--   column order of the RTuple.
toListColumnName :: RTupleMData -> [ColumnName]

-- | toListColumnInfo: returns a list of RTuple columnInfo, in the fixed
--   column order of the RTuple
toListColumnInfo :: RTupleMData -> [ColumnInfo]

-- | Compares the structure of the input <a>RTable</a>s and returns
--   <a>True</a> if these are the same. By "structure", we mean that the
--   <a>ColumnName</a>s and the corresponding data types must match.
--   Essentially what we record in the <a>ColumnInfo</a> must be the same
--   for the two <a>RTable</a>s. Note that in the case of two columns
--   having the same name but one of the two (or both) have a <a>dtype</a>
--   equal to <a>UknownType</a>, then this function assumes that they are
--   the same (i.e., equal <a>ColumnInfo</a>s).
rtabsSameStructure :: RTable -> RTable -> Bool

-- | Compares the structure of the input <a>RTuple</a>s and returns
--   <a>True</a> if these are the same. By "structure", we mean that the
--   <a>ColumnName</a>s and the corresponding data types must match.
--   Essentially what we record in the <a>ColumnInfo</a> must be the same
--   for the two <a>RTuple</a>s
rtuplesSameStructure :: RTuple -> RTuple -> Bool

-- | Receives two lists of <a>ColumnName</a>s and returns the unique list
--   of <a>ColumnName</a>s after concatenating the two and removing from
--   the second list the names that duplicate a name of the first list or
--   extend it with a numeric suffix. This function is intended to
--   deduplicate common columns after a join (see <tt>ij</tt>), where
--   <a>ColA</a> for example, will also appear as <a>ColA_1</a>. This
--   function DOES NOT deduplicate columns <a>ColA</a> and
--   <a>ColAsomeSuffix</a>, only cases of the form <a>ColName_Num</a>
--   (e.g., ColName_1, ColName_2, etc.). Here is an example:
--   
--   <pre>
--   &gt;&gt;&gt; getUniqueColumnNamesAfterJoin ["ColA","ColB"] ["ColC","ColA", "ColA_1", "ColA_2", "ColA_A", "ColA_hello", "ColAhello"]
--   ["ColA","ColB","ColC","ColA_A","ColA_hello","ColAhello"]
--   </pre>
getUniqueColumnNamesAfterJoin :: [ColumnName] -> [ColumnName] -> [ColumnName]

-- | This exception is thrown whenever we try to access a specific column
--   (i.e., <a>ColumnName</a>) of an <a>RTuple</a> and the column does not
--   exist.
data ColumnDoesNotExist
ColumnDoesNotExist :: ColumnName -> ColumnDoesNotExist

-- | This exception means that we have tried to do some operation between
--   two <tt>RTables</tt>, which requires that the structure of the two is
--   the same. e.g., an <tt>Insert Into <a>TAB</a> RTuples</tt>, or a
--   <tt>UNION</tt> or other set operations. By "structure", we mean that
--   the <a>ColumnName</a>s and the corresponding data types must match.
--   Essentially what we record in the <a>ColumnInfo</a> must be the same
--   for the two <a>RTable</a>s
data ConflictingRTableStructures

-- | Error message indicating the operation that failed.
ConflictingRTableStructures :: String -> ConflictingRTableStructures

-- | This exception is thrown when one (or both) of the input
--   <a>String</a>s to the function <a>toRTimestamp</a> is empty.
data EmptyInputStringsInToRTimestamp
EmptyInputStringsInToRTimestamp :: String -> String -> EmptyInputStringsInToRTimestamp

-- | This exception is thrown whenever we provide a Timestamp format that
--   does not contain any valid format pattern
data UnsupportedTimeStampFormat
UnsupportedTimeStampFormat :: String -> UnsupportedTimeStampFormat

-- | This exception means that we have tried an Upsert operation where the
--   source <a>RTable</a> does not have a unique set of <tt>RTuple</tt>s if
--   grouped by the columns used in the matching condition. This simply
--   means that we cannot determine which of the duplicate <a>RTuple</a>s
--   in the source <a>RTable</a> will overwrite the target <a>RTable</a>,
--   when the matching condition is satisfied.
data UniquenessViolationInUpsert

-- | Error message
UniquenessViolationInUpsert :: String -> UniquenessViolationInUpsert

-- | printRTable : Print the input RTable on screen
printRTable :: RTable -> IO ()

-- | Safe <a>printRTable</a> alternative that returns an <a>Either</a>, so
--   as to give the ability to handle exceptions gracefully, during the
--   evaluation of the input RTable. Example:
--   
--   <pre>
--   do 
--    p &lt;- (eitherPrintRTable  printRTable myRTab) :: IO (Either SomeException ())
--    case p of
--              Left exc -&gt; putStrLn $ "There was an error in the Julius evaluation: " ++ (show exc)
--              Right _  -&gt; return ()
--   </pre>
eitherPrintRTable :: Exception e => (RTable -> IO ()) -> RTable -> IO (Either e ())

-- | prints an RTable with an RTuple format specification. It can be used
--   instead of <a>printRTable</a> when one of the following two is
--   required:
--   
--   <ul>
--   <li>a) When we want to specify the order that the columns will be
--   printed on screen</li>
--   <li>b) When we want to specify the formatting of the values by using a
--   <a>printf</a>-like <a>FormatSpecifier</a></li>
--   </ul>
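--   
--   A minimal usage sketch covering both cases, built only from the
--   generator functions of this module. The input table and its column
--   names are hypothetical, <a>ColumnName</a> is assumed to be a
--   <tt>Text</tt> (hence <tt>OverloadedStrings</tt>), and the format string
--   is just an illustrative <a>printf</a>-style pattern:
--   
--   <pre>
--   {-# LANGUAGE OverloadedStrings #-}
--   import RTable.Core
--   
--   -- myRTab is a hypothetical RTable with columns OWNER and NUM_ROWS
--   printOwners :: RTable -&gt; IO ()
--   printOwners myRTab =
--       printfRTable
--           (genRTupleFormat
--               ["OWNER", "NUM_ROWS"]                     -- a) column order on screen
--               (genColFormatMap
--                   [ ("OWNER", Format "%20s")            -- b) explicit format
--                   , ("NUM_ROWS", DefaultFormat)         --    default format
--                   ]))
--           myRTab
--   </pre>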
printfRTable :: RTupleFormat -> RTable -> IO ()

-- | Safe <a>printfRTable</a> alternative that returns an <a>Either</a>,
--   so as to give the ability to handle exceptions gracefully, during the
--   evaluation of the input RTable. Example:
--   
--   <pre>
--   do 
--    p &lt;- (eitherPrintfRTable printfRTable myFormat myRTab) :: IO (Either SomeException ())
--    case p of
--              Left exc -&gt; putStrLn $ "There was an error in the Julius evaluation: " ++ (show exc)
--              Right _  -&gt; return ()
--   </pre>
eitherPrintfRTable :: Exception e => (RTupleFormat -> RTable -> IO ()) -> RTupleFormat -> RTable -> IO (Either e ())

-- | Basic data type for defining the desired formatting of an
--   <a>RTuple</a> when printing an RTable (see <a>printfRTable</a>).
data RTupleFormat
RTupleFormat :: [ColumnName] -> ColFormatMap -> RTupleFormat

-- | For defining the column ordering (i.e., the SELECT clause in SQL)
[colSelectList] :: RTupleFormat -> [ColumnName]

-- | For defining the formatting per Column in "<a>printf</a> style"
[colFormatMap] :: RTupleFormat -> ColFormatMap

-- | A map of ColumnName to Format Specification
type ColFormatMap = HashMap ColumnName FormatSpecifier

-- | Format specifier of <a>printf</a> style
data FormatSpecifier
DefaultFormat :: FormatSpecifier
Format :: String -> FormatSpecifier

-- | A sum type to help the specification of a column ordering (Ascending,
--   or Descending)
data OrderingSpec
Asc :: OrderingSpec
Desc :: OrderingSpec

-- | Generate an RTupleFormat data type instance
genRTupleFormat :: [ColumnName] -> ColFormatMap -> RTupleFormat

-- | Generate a default RTupleFormat data type instance. In this case the
--   returned column order (Select list) will be unspecified and dependent
--   only on the underlying structure of the <a>RTuple</a> (<a>HashMap</a>)
genRTupleFormatDefault :: RTupleFormat

-- | Generates a Column Format Specification
genColFormatMap :: [(ColumnName, FormatSpecifier)] -> ColFormatMap

-- | Generates a default Column Format Specification
genDefaultColFormatMap :: ColFormatMap
instance GHC.Show.Show RTable.Core.UniquenessViolationInUpsert
instance GHC.Classes.Eq RTable.Core.UniquenessViolationInUpsert
instance GHC.Show.Show RTable.Core.ConflictingRTableStructures
instance GHC.Classes.Eq RTable.Core.ConflictingRTableStructures
instance GHC.Show.Show RTable.Core.EmptyInputStringsInToRTimestamp
instance GHC.Classes.Eq RTable.Core.EmptyInputStringsInToRTimestamp
instance GHC.Show.Show RTable.Core.UnsupportedTimeStampFormat
instance GHC.Classes.Eq RTable.Core.UnsupportedTimeStampFormat
instance GHC.Show.Show RTable.Core.ColumnDoesNotExist
instance GHC.Classes.Eq RTable.Core.ColumnDoesNotExist
instance GHC.Show.Show RTable.Core.RTupleFormat
instance GHC.Classes.Eq RTable.Core.RTupleFormat
instance GHC.Show.Show RTable.Core.FormatSpecifier
instance GHC.Classes.Eq RTable.Core.FormatSpecifier
instance GHC.Classes.Eq RTable.Core.OrderingSpec
instance GHC.Show.Show RTable.Core.OrderingSpec
instance GHC.Classes.Eq RTable.Core.RTableMData
instance GHC.Show.Show RTable.Core.RTableMData
instance GHC.Classes.Eq RTable.Core.ColumnInfo
instance GHC.Show.Show RTable.Core.ColumnInfo
instance GHC.Generics.Generic RTable.Core.RDataType
instance GHC.Read.Read RTable.Core.RDataType
instance GHC.Show.Show RTable.Core.RDataType
instance GHC.Generics.Generic RTable.Core.RTimestamp
instance GHC.Read.Read RTable.Core.RTimestamp
instance GHC.Show.Show RTable.Core.RTimestamp
instance GHC.Show.Show RTable.Core.IgnoreDefault
instance GHC.Classes.Eq RTable.Core.IgnoreDefault
instance GHC.Classes.Eq RTable.Core.ColumnDType
instance GHC.Show.Show RTable.Core.ColumnDType
instance GHC.Exception.Type.Exception RTable.Core.UniquenessViolationInUpsert
instance GHC.Exception.Type.Exception RTable.Core.ConflictingRTableStructures
instance GHC.Exception.Type.Exception RTable.Core.EmptyInputStringsInToRTimestamp
instance GHC.Exception.Type.Exception RTable.Core.UnsupportedTimeStampFormat
instance GHC.Exception.Type.Exception RTable.Core.ColumnDoesNotExist
instance Control.DeepSeq.NFData RTable.Core.RDataType
instance GHC.Classes.Eq RTable.Core.RDataType
instance GHC.Classes.Ord RTable.Core.RDataType
instance GHC.Num.Num RTable.Core.RDataType
instance GHC.Real.Fractional RTable.Core.RDataType
instance Control.DeepSeq.NFData RTable.Core.RTimestamp
instance GHC.Classes.Eq RTable.Core.RTimestamp
instance GHC.Classes.Ord RTable.Core.RTimestamp


-- | This is an internal module (i.e., not to be imported directly) that
--   implements the core ETL functionality that is exposed via the
--   <b>Julius</b> EDSL for ETL/ELT (found in the <a>Etl.Julius</a> module)
module Etl.Internal.Core

-- | This is the basic data type to define the column-to-column mapping
--   from a source <a>RTable</a> to a target <a>RTable</a>. Essentially, an
--   <a>RColMapping</a> represents the column-level transformations of an
--   <a>RTuple</a> that will yield a target <a>RTuple</a>.
--   
--   A mapping is simply a tuple of the form ( Source-Column(s),
--   Target-Column(s), Transformation, RTuple-Filter), where we define the
--   source columns over which a transformation (i.e. a function) will be
--   applied in order to yield the target columns. Also, an
--   <a>RPredicate</a> (i.e. a filter) might be applied on the source
--   <a>RTuple</a>. Remember that an <a>RTuple</a> is essentially a mapping
--   between a key (the Column Name) and a value (the <a>RDataType</a>
--   value). So the various <a>RColMapping</a> data constructors below
--   simply describe the possible modifications of an <a>RTuple</a>
--   originating from its own columns.
--   
--   So, we can have the following mapping types. In every case, the source
--   column(s) will be removed from the <a>RTuple</a> or not based on the
--   <a>removeSrcCol</a> flag (duplicate column names are not allowed in an
--   <a>RTuple</a>):
--   
--   <ul>
--   <li>a) single-source column to single-target column mapping (1 to
--   1).</li>
--   <li>b) multiple-source columns to single-target column mapping (N to
--   1). The N columns will be merged into the single target column based
--   on the transformation.</li>
--   <li>c) single-source column to multiple-target columns mapping (1 to
--   M). The source column will be "expanded" to M target columns based on
--   the transformation.</li>
--   <li>d) multiple-source columns to multiple-target columns mapping (N
--   to M). The N source columns will be mapped to M target columns based
--   on the transformation.</li>
--   </ul>
--   
--   Some examples of mapping are the following:
--   
--   <pre>
--   (<a>Start_Date</a>, No, <a>StartDate</a>, \t -&gt; True)  --  copy the source value to target and don't remove the source column, so the target RTuple will have both columns <a>Start_Date</a> and <a>StartDate</a>
--                                         --  with exactly the same value
--   
--   ([<a>Amount</a>, <a>Discount</a>], Yes, <a>FinalAmount</a>, (\[a, d] -&gt; a * d) ) -- <a>FinalAmount</a> is a derived column based on a function applied to the two source columns.
--                                                                      --  In the final RTuple we remove the two source columns.
--   
--   </pre>
--   
--   An <a>RColMapping</a> can be applied with the <a>runCM</a>
--   (runColMapping) operator
data RColMapping
ColMapEmpty :: RColMapping

-- | single-source column to single-target column mapping (1 to 1).
RMap1x1 :: ColumnName -> YesNo -> ColumnName -> (RDataType -> RDataType) -> RPredicate -> RColMapping
[srcCol] :: RColMapping -> ColumnName
[removeSrcCol] :: RColMapping -> YesNo
[trgCol] :: RColMapping -> ColumnName
[transform1x1] :: RColMapping -> RDataType -> RDataType
[srcRTupleFilter] :: RColMapping -> RPredicate

-- | multiple-source columns to single-target column mapping (N to 1)
RMapNx1 :: [ColumnName] -> YesNo -> ColumnName -> ([RDataType] -> RDataType) -> RPredicate -> RColMapping
[srcColGrp] :: RColMapping -> [ColumnName]
[removeSrcCol] :: RColMapping -> YesNo
[trgCol] :: RColMapping -> ColumnName
[transformNx1] :: RColMapping -> [RDataType] -> RDataType
[srcRTupleFilter] :: RColMapping -> RPredicate

-- | single-source column to multiple-target columns mapping (1 to N)
RMap1xN :: ColumnName -> YesNo -> [ColumnName] -> (RDataType -> [RDataType]) -> RPredicate -> RColMapping
[srcCol] :: RColMapping -> ColumnName
[removeSrcCol] :: RColMapping -> YesNo
[trgColGrp] :: RColMapping -> [ColumnName]
[transform1xN] :: RColMapping -> RDataType -> [RDataType]
[srcRTupleFilter] :: RColMapping -> RPredicate

-- | multiple-source column to multiple target columns mapping (N to M)
RMapNxM :: [ColumnName] -> YesNo -> [ColumnName] -> ([RDataType] -> [RDataType]) -> RPredicate -> RColMapping
[srcColGrp] :: RColMapping -> [ColumnName]
[removeSrcCol] :: RColMapping -> YesNo
[trgColGrp] :: RColMapping -> [ColumnName]
[transformNxM] :: RColMapping -> [RDataType] -> [RDataType]
[srcRTupleFilter] :: RColMapping -> RPredicate

-- | A Column Transformation function data type. It is used in order to
--   define an arbitrary column-level transformation (i.e., from a list of
--   N input Column-Values we produce a list of M derived (output)
--   Column-Values). A Column value is represented with the
--   <a>RDataType</a>.
type ColXForm = [RDataType] -> [RDataType]

-- | Constructs an RColMapping. This is the suggested method for creating a
--   column mapping and not by calling the data constructors directly.
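--   
--   A minimal sketch of an N-to-1 mapping built this way, following the
--   argument order of the signature below. It assumes that
--   <a>ColumnName</a> is a <tt>Text</tt> (hence
--   <tt>OverloadedStrings</tt>) and that an <a>RPredicate</a> is a
--   function from an <a>RTuple</a> to a <tt>Bool</tt>; the column names are
--   hypothetical. The modules are imported directly only to keep the
--   sketch short:
--   
--   <pre>
--   {-# LANGUAGE OverloadedStrings #-}
--   import RTable.Core
--   import Etl.Internal.Core
--   
--   finalAmount :: RColMapping
--   finalAmount =
--       createColMapping
--           ["Amount", "Discount"]   -- source columns
--           ["FinalAmount"]          -- target column
--           xform                    -- the column transformation (ColXForm)
--           Yes                      -- remove the source columns
--           (const True)             -- assumed RPredicate: accept every RTuple
--     where
--       xform [a, d] = [a * d]       -- relies on the Num instance of RDataType
--       xform _      = [Null]        -- fallback for an unexpected number of inputs
--   
--   -- apply the mapping to a (hypothetical) source RTable
--   applyFinalAmount :: RTable -&gt; RTable
--   applyFinalAmount = runCM finalAmount
--   </pre>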
createColMapping :: [ColumnName] -> [ColumnName] -> ColXForm -> YesNo -> RPredicate -> RColMapping

-- | An ETL operation applied to an RTable can be either an
--   <a>ROperation</a> (a relational algebra operation like join, filter
--   etc.) defined in <a>RTable.Core</a> module, or an <a>RColMapping</a>
--   applied to an <a>RTable</a>
data ETLOperation
ETLrOp :: ROperation -> ETLOperation
[rop] :: ETLOperation -> ROperation
ETLcOp :: RColMapping -> ETLOperation
[cmap] :: ETLOperation -> RColMapping

-- | ETLMapping: it is the equivalent of a mapping in an ETL tool and
--   consists of a series of ETLOperations that are applied, one-by-one, to
--   some initial input RTable. If binary ETLOperations are included in
--   the ETLMapping, then there will be more than one input RTable that
--   the ETLOperations of the ETLMapping will be applied to. When we apply
--   (i.e., run) an ETLOperation of the ETLMapping we get a new RTable,
--   which is then passed as input to the next ETLOperation, until we have
--   run all ETLOperations. The purpose of the execution of an ETLMapping
--   is to produce a single new RTable as the result of the execution of
--   all the ETLOperations of the ETLMapping. In terms of database
--   operations an ETLMapping is the equivalent of a CREATE AS SELECT
--   (CTAS) operation in an RDBMS. This means that anything that can be
--   done in the SELECT part (i.e., column projection, row filtering,
--   grouping and join operations, etc.) in order to produce a new table,
--   can be included in an ETLMapping.
--   
--   An ETLMapping is executed with the etl (runETLmapping) operator.
--   
--   Implementation: An ETLMapping is implemented as a binary tree where
--   the node represents the ETLOperation to be executed and the left
--   branch is another ETLMapping, while the right branch is an RTable
--   (that might be empty in the case of a Unary ETLOperation). Execution
--   proceeds from bottom-left to top-right. This is similar in concept to
--   a left-deep join tree. In a Left-Deep ETLOperation tree the "pipe" of
--   ETLOperations comes from the left branches always. The leaf node is
--   always an ETLMapping with an ETLMapEmpty in the left branch and an
--   RTable in the right branch (the initial RTable given as input to the
--   ETLMapping). In this way, the result of the execution of each
--   ETLOperation (which is an RTable) is passed on to the next
--   ETLOperation. Here is an example:
--   
--   <pre>
--       A Left-Deep ETLOperation Tree
--   
--                                final RTable result
--                                      / 
--                                   etlOp3 
--                                /        
--                             etlOp2     rtab2
--                            /       
--   A leaf-node --&gt;    etlOp1    emptyRTab
--                      /       
--                ETLMapEmpty   rtab1
--   </pre>
--   
--   You see that always on the left branch we have an ETLMapping data type
--   (i.e., a left-deep ETLOperation tree). So how do we implement the
--   following case?
--   
--   <pre>
--                      final RTable result
--                              / 
--   A leaf-node --&gt;         etlOp1 
--                           /       
--                          rtab1   rtab2
--   </pre>
--   
--   The answer is that we "model" the left RTable (rtab1 in our example)
--   as an ETLMapping of the form:
--   
--   <pre>
--   ETLMapLD { etlOp = ETLcOp{cmap = ColMapEmpty}, tabL = ETLMapEmpty, tabR = rtab1 }
--   </pre>
--   
--   So we embed the rtab1 in an ETLMapping, which is a leaf (i.e., it has
--   an empty left branch, tabL), the rtab1 is in the right branch (tabR)
--   and the ETLOperation is the empty column mapping (ColMapEmpty), which
--   returns its input RTable when executed. We can use function
--   <a>rtabToETLMapping</a> for this job. So it becomes:
--   
--   <pre>
--   A leaf-node --&gt;    etlOp1    
--                      /     
--     rtabToETLMapping rtab1  rtab2
--   </pre>
--   
--   In this manner, a leaf-node can also be implemented like this:
--   
--   <pre>
--                                final RTable result
--                                      / 
--                                   etlOp3 
--                                /        
--                             etlOp2     rtab2
--                            /       
--   A leaf-node --&gt;    etlOp1    emptyRTab
--                      /     
--     rtabToETLMapping rtab1  emptyRTable
--   </pre>
data ETLMapping

-- | an empty node
ETLMapEmpty :: ETLMapping

-- | a Left-Deep node
ETLMapLD :: ETLOperation -> ETLMapping -> RTable -> ETLMapping

-- | the ETLOperation to be executed
[etlOp] :: ETLMapping -> ETLOperation

-- | the left-branch corresponding to the previous ETLOperation, which is
--   input to this one.
[tabL] :: ETLMapping -> ETLMapping

-- | the right branch corresponds to another RTable (for binary ETL
--   operations). If this is a Unary ETLOperation then this field must be
--   an empty RTable.
[tabR] :: ETLMapping -> RTable

-- | a Right-Deep node
ETLMapRD :: ETLOperation -> RTable -> ETLMapping -> ETLMapping

-- | the ETLOperation to be executed
[etlOp] :: ETLMapping -> ETLOperation

-- | the left-branch corresponds to another RTable (for binary ETL
--   operations). If this is a Unary ETLOperation then this field must be
--   an empty RTable.
[tabLrd] :: ETLMapping -> RTable

-- | the right branch corresponding to the previous ETLOperation, which is
--   input to this one.
[tabRrd] :: ETLMapping -> ETLMapping

-- | a Balanced node
ETLMapBal :: ETLOperation -> ETLMapping -> ETLMapping -> ETLMapping

-- | the ETLOperation to be executed
[etlOp] :: ETLMapping -> ETLOperation

-- | the left-branch corresponding to the previous ETLOperation, which is
--   input to this one. If this is a Unary ETLOperation then this field
--   might be an empty ETLMapping.
[tabLbal] :: ETLMapping -> ETLMapping

-- | the right branch corresponding to the previous ETLOperation, which is
--   input to this one. If this is a Unary ETLOperation then this field
--   might be an empty ETLMapping.
[tabRbal] :: ETLMapping -> ETLMapping
data YesNo
Yes :: YesNo
No :: YesNo

-- | The runCM operator executes an RColMapping. If a target-column has the
--   same name as a source-column and a DontRemoveSrc (i.e., removeSrcCol
--   == No) has been specified, then the (target-column, target-value)
--   key-value pair overwrites the corresponding (source-column,
--   source-value) key-value pair.
runCM :: RColMapping -> RTable -> RTable

-- | executes a Unary ETL Operation
etlOpU :: ETLOperation -> RTable -> RTable

-- | executes a Binary ETL Operation
etlOpB :: ETLOperation -> RTable -> RTable -> RTable

-- | This operator executes an <a>ETLMapping</a>
etl :: ETLMapping -> RTable

-- | This operator executes an <a>ETLMapping</a> and returns the
--   <a>RTabResult</a> <tt>Writer</tt> Monad that embeds, apart from the
--   resulting RTable, also the number of <a>RTuple</a>s returned.
etlRes :: ETLMapping -> RTabResult

-- | Model an <a>RTable</a> as an <a>ETLMapping</a> which when executed
--   will return the input <a>RTable</a>
rtabToETLMapping :: RTable -> ETLMapping

-- | Creates a left-deep leaf ETL Mapping, of the following form:
--   
--   <pre>
--       A Left-Deep ETLOperation Tree
--   
--                                final RTable result
--                                      / 
--                                   etlOp3 
--                                /        
--                             etlOp2     rtab2
--                            /       
--   A leaf-node --&gt;    etlOp1    emptyRTab
--                      /       
--                ETLMapEmpty   rtab1
--   </pre>
createLeafETLMapLD :: ETLOperation -> RTable -> ETLMapping

-- | creates a Binary operation leaf node of the form:
--   
--   <pre>
--   A leaf-node --&gt;    etlOp1    
--                      /     
--     rtabToETLMapping rtab1  rtab2
--   </pre>
createLeafBinETLMapLD :: ETLOperation -> RTable -> RTable -> ETLMapping

-- | Connects an ETL Mapping to a left-deep ETL Mapping tree, of the form
--   
--   <pre>
--       A Left-Deep ETLOperation Tree
--   
--                                final RTable result
--                                      / 
--                                   etlOp3 
--                                /        
--                             etlOp2     rtab2
--                            /       
--   A leaf-node --&gt;    etlOp1    emptyRTab
--                      /       
--                ETLMapEmpty   rtab1
--   </pre>
--   
--   Example:
--   
--   <pre>
--   -- connect a Unary ETL mapping (etlOp2)
--   
--                           etlOp2    
--                          /       
--                       etlOp1    emptyRTab
--        
--   =&gt; connectETLMapLD etlOp2 emptyRTable prevMap
--   
--   -- connect a Binary ETL Mapping (etlOp3)
--   
--                                 etlOp3 
--                              /        
--                           etlOp2     rtab2
--   
--   =&gt; connectETLMapLD etlOp3 rtab2 prevMap
--   </pre>
--   
--   Note that the right branch (RTable) appears first in the list of input
--   arguments of this function and the left branch (ETLMapping) appears
--   second. This is strange, and one could think that it is a mistake
--   (i.e., the left branch should appear first and the right branch
--   second) since we are reading from left to right. However this was a
--   deliberate choice, so that we leave the left branch (which is the
--   connection point with the previous ETLMapping) as the last argument,
--   and thus we can partially apply the arguments and get a new function
--   whose only input parameter is the previous mapping. This is very
--   helpful in function composition.
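--   
--   A minimal sketch of such a composition, building the three-operation
--   tree shown above by partially applying <a>connectETLMapLD</a>. The
--   operations and input tables are hypothetical parameters, and the
--   helper name <tt>buildMapping</tt> is made up for illustration:
--   
--   <pre>
--   import RTable.Core       (RTable, emptyRTable)
--   import Etl.Internal.Core
--   
--   buildMapping :: ETLOperation -&gt; ETLOperation -&gt; ETLOperation
--                -&gt; RTable -&gt; RTable -&gt; ETLMapping
--   buildMapping etlOp1 etlOp2 etlOp3 rtab1 rtab2 =
--       ( connectETLMapLD etlOp3 rtab2          -- binary step, runs last
--       . connectETLMapLD etlOp2 emptyRTable    -- unary step
--       ) (createLeafETLMapLD etlOp1 rtab1)     -- leaf node, runs first
--   
--   -- executing the mapping yields the final RTable
--   runMapping :: ETLOperation -&gt; ETLOperation -&gt; ETLOperation
--              -&gt; RTable -&gt; RTable -&gt; RTable
--   runMapping op1 op2 op3 t1 t2 = etl (buildMapping op1 op2 op3 t1 t2)
--   </pre>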
connectETLMapLD :: ETLOperation -> RTable -> ETLMapping -> ETLMapping
instance GHC.Show.Show Etl.Internal.Core.YesNo
instance GHC.Classes.Eq Etl.Internal.Core.YesNo
instance GHC.Classes.Eq Etl.Internal.Core.ETLMapping


-- | <b>Julius</b> is a type-level <i>Embedded Domain Specific Language
--   (EDSL)</i> for ETL/ELT data processing in Haskell. Julius enables us
--   to express complex data transformation flows (i.e., an arbitrary
--   combination of ETL operations) in a more friendly manner (a <b>Julius
--   Expression</b>), with plain Haskell code (no special language for ETL
--   scripting required). For more information read this <a>Julius
--   Tutorial</a>.
--   
--   <h1>When to use this module</h1>
--   
--   This module should be used whenever one has "tabular data" (e.g., some
--   CSV files, or any type of data that can be an instance of the
--   <a>RTabular</a> type class and thus define the <a>toRTable</a> and
--   <a>fromRTable</a> functions) and wants to analyze them in-memory with
--   the well-known relational algebra operations (selection, projection,
--   join, groupby, aggregations etc) that lie behind SQL. This data
--   analysis takes place within your haskell code, without the need to
--   import the data into a database (database-less data processing) and
--   the result can be turned into the original format (e.g., CSV) with a
--   simple call to the <a>fromRTable</a> function.
--   
--   <a>Etl.Julius</a> provides a simple language for expressing all
--   relational algebra operations and arbitrary combinations of them, and
--   thus is a powerful tool for expressing complex data transformations in
--   Haskell. Moreover, the Julius language includes a clause for the
--   <b>Column Mapping</b> (<a>RColMapping</a>) concept, which is a
--   construct used in ETL tools and enables arbitrary transformations at
--   the column level and the creation of derived columns based on
--   arbitrary expressions on the existing ones. Finally, the ad hoc
--   combination of relational operations and Column Mappings, chained in
--   a data transformation flow, implements the concept of the <b>ETL
--   Mapping</b> (<a>ETLMapping</a>), which is the core data mapping unit
--   in all ETL tools and embeds all the "ETL-logic" for loading/creating a
--   single <b>target</b> <a>RTable</a> from a set of <b>source</b>
--   <a>RTable</a>s. It is implemented in the <a>Etl.Internal.Core</a>
--   module. For the relational algebra operations, Julius exploits the
--   functions in the <a>RTable.Core</a> module, which it also re-exports.
--   
--   The <a>Julius EDSL</a> is the recommended method for expressing ETL
--   flows in Haskell, as well as doing any data analysis task within the
--   <a>DBFunctor</a> package. <a>Etl.Julius</a> is a self-sufficient
--   module and imports all necessary functionality from <a>RTable.Core</a>
--   and <a>Etl.Internal.Core</a> modules, so a programmer should only
--   import <a>Etl.Julius</a> and nothing else, in order to have complete
--   functionality.
--   
--   <h1>Overview</h1>
--   
--   The core data type in the Julius EDSL is the <a>ETLMappingExpr</a>.
--   This data type creates a so-called <b>Julius Expression</b>. This
--   Julius expression is the "Haskell equivalent" to the ETL Mapping
--   concept discussed above. It is evaluated to an <a>ETLMapping</a> (see
--   <a>ETLMapping</a>), which is our data structure for the internal
--   representation of the ETL Mapping, with the <a>evalJulius</a> function
--   and from then, evaluated into an <a>RTable</a> (see
--   <a>juliusToRTable</a>), which is the final result of our
--   transformation.
--   
--   A <i>Julius Expression</i> is a chain of ETL Operation Expressions
--   (<tt>ETLOpExpr</tt>) connected with the <a>:-&gt;</a> constructor (or
--   with the <a>:=&gt;</a> constructor for named result operations - see
--   below for an explanation). This chain of ETL Operations always starts
--   with the <a>EtlMapStart</a> constructor and is executed from
--   left-to-right, or from top-to-bottom:
--   
--   <pre>
--   EtlMapStart :-&gt; &lt;ETLOpExpr&gt; :-&gt; &lt;ETLOpExpr&gt; :-&gt; ... :-&gt; &lt;ETLOpExpr&gt;
--   -- equivalently
--   EtlMapStart :-&gt; &lt;ETLOpExpr&gt; 
--               :-&gt; &lt;ETLOpExpr&gt; 
--               :-&gt; ... 
--               :-&gt; &lt;ETLOpExpr&gt;
--   </pre>
--   
--   A Named ETL Operation Expression (<a>NamedMap</a>) is just an ETL
--   Operation with a name, so as to be able to reference this specific
--   step in the chain of ETL Operations. It is actually <b>a named
--   intermediate result</b>, which we can reference and use in other parts
--   of our Julius expression. It is similar in notion to a subquery, known
--   as an INLINE VIEW, or better, it is equivalent to the <tt>WITH</tt>
--   clause in SQL (i.e., also called subquery factoring in SQL parlance).
--   For example:
--   
--   <pre>
--   EtlMapStart :-&gt; &lt;ETLOpExpr&gt; 
--               :=&gt; NamedResult "my_intermdt_result" &lt;ETLOpExpr&gt; 
--               :-&gt; ... 
--               :-&gt; &lt;ETLOpExpr&gt;
--   </pre>
--   
--   An ETL Operation Expression (<a>ETLOpExpr</a>) - a.k.a. a <b>Julius
--   Expression</b> - is either a Column Mapping Expression
--   (<a>ColMappingExpr</a>), or a Relational Operation Expression
--   (<a>ROpExpr</a>). The former is used in order to express a <b>Column
--   Mapping</b> (i.e., an arbitrary transformation at the column level,
--   with which we can create any derived column based on existing columns,
--   see <a>RColMapping</a>) and the latter is a <b>Relational Operation</b>
--   (Selection, Projection, Join, Outer Join, Group By, Order By,
--   Aggregate, or a generic Unary or Binary RTable operation, see
--   <a>ROperation</a>).
--   
--   <h2>Typical Structure of an ETL Program using <a>Etl.Julius</a></h2>
--   
--   <pre>
--   import     Etl.Julius
--   import     RTable.Data.CSV     (CSV, readCSV, writeCSV, toRTable)
--   
--   -- 1. Define table metadata
--   -- E.g.,
--   src_DBTab_MData :: RTableMData
--   src_DBTab_MData = 
--       createRTableMData   (   "sourceTab"  -- table name
--                               ,[  ("OWNER", Varchar)                                      -- Owner of the table
--                                   ,("TABLE_NAME", Varchar)                                -- Name of the table
--                                   ,("TABLESPACE_NAME", Varchar)                           -- Tablespace name
--                                   ,("STATUS",Varchar)                                     -- Status of the table object (VALID/INVALID)
--                                   ,("NUM_ROWS", Integer)                                  -- Number of rows in the table
--                                   ,("BLOCKS", Integer)                                    -- Number of Blocks allocated for this table
--                                   ,("LAST_ANALYZED", Timestamp "MM/DD/YYYY HH24:MI:SS")   -- Timestamp of the last time the table was analyzed (i.e., gathered statistics) 
--                               ]
--                           )
--                           ["OWNER", "TABLE_NAME"] -- primary key
--                           [] -- (alternative) unique keys    
--   
--   -- Result RTable metadata
--   result_tab_MData :: RTableMData
--   result_tab_MData = ...
--   
--   -- 2. Define your ETL code
--   -- E.g.,
--   myETL :: [RTable] -&gt; [RTable]
--   myETL [rtab] = 
--       -- 3. Define your Julius Expression(s)
--       let jul = 
--               EtlMapStart
--               :-&gt; (EtlR $
--                       ROpStart  
--                       :. (...)
--               ...
--       -- 4. Evaluate Julius to the Result RTable
--       in [juliusToRTable jul]
--   
--   main :: IO ()
--   main = do
--       
--       -- 5. read source csv files
--       -- E.g.,
--       srcCSV &lt;- readCSV "./app/test-data.csv"
--   
--       -- 6. Convert CSV to an RTable and do your ETL
--       [resultRTab] &lt;- runETL myETL $ [toRTable src_DBTab_MData srcCSV]
--   
--       -- 7. Print your results on screen
--       -- E.g.,
--       printfRTable (genRTupleFormat ["OWNER", "TABLE_NAME","LAST_ANALYZED"] genDefaultColFormatMap) $ resultRTab
--   
--       -- 8. Save your result to a CSV file
--       -- E.g.,
--       writeCSV "./app/result-data.csv" $ 
--                       fromRTable result_tab_MData resultRTab
--   </pre>
--   
--   <ul>
--   <li><i>-- 1.</i> We define the necessary <a>RTable</a> metadata for
--   each <a>RTable</a> in our program. This is equivalent to a <tt>CREATE
--   TABLE</tt> DDL statement in SQL.</li>
--   <li><i>-- 2.</i> Here is where we define our ETL code. We don't want to
--   do our ETL in the main function, so we put the ETL code into a
--   separate function (<tt>myETL</tt>). In general, in our main, we want
--   to follow the pattern:<ul><li>Read your Input (Extract
--   phase)</li><li>Do your ETL (Transform phase)</li><li>Write your Output
--   (Load phase)</li></ul></li>
--   </ul>
--   
--   This function receives as input a list with all the necessary
--   <b>Source</b> <a>RTable</a>s (in our case a single-item list)
--   and outputs a list with all the resulting (<b>Target</b>)
--   <a>RTable</a>s, after all the necessary transformation steps have
--   been executed. Of course, ETL code might produce more than one
--   target <a>RTable</a>, e.g., a target schema (in DB parlance) and not
--   just one as in our example. Moreover, the <tt>myETL</tt> function can be
--   arbitrarily complex, depending on the ETL logic that we want to
--   implement in each case. It is essentially the entry point to our ETL
--   implementation.
--   
--   <ul>
--   <li><i>-- 3.</i> Our ETL code in general will consist of an arbitrary
--   number of Julius expressions. One can define multiple separate Julius
--   expressions, some of which might depend on others, in order to
--   implement the corresponding ETL logic. Keep in mind that each Julius
--   expression encapsulates the "transformation logic" for producing a
--   <b>single target RTable</b>. This holds, even if the target RTable is
--   an intermediate result in the overall ETL process and not a final
--   result RTable.</li>
--   </ul>
--   
--   The individual Julius expressions must be evaluated in an order that
--   respects the input <a>RTable</a> prerequisites of each expression.
--   So, first we must evaluate all the Julius expressions that
--   don't depend on other Julius expressions but only on source RTables.
--   Then, we evaluate the Julius expressions that depend on the previous
--   ones, and so on.
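--   
--   As a minimal sketch of such a dependency (the column names come from the
--   metadata example above, the names <tt>projected</tt> and
--   <tt>sorted</tt> are hypothetical, and we assume the
--   <tt>OverloadedStrings</tt> extension for the column-name literals), the
--   second Julius expression below takes the result of the first one as its
--   input:
--   
--   <pre>
--   {-# LANGUAGE OverloadedStrings #-}
--   import Etl.Julius
--   
--   myETL :: [RTable] -&gt; [RTable]
--   myETL [src] =
--       let -- depends only on the source RTable
--           projected = juliusToRTable $
--               EtlMapStart
--               :-&gt; (EtlR $
--                       ROpStart
--                       :. (Select ["OWNER", "TABLE_NAME", "NUM_ROWS"] $ From $ Tab src))
--           -- depends on the result of the previous Julius expression
--           sorted = juliusToRTable $
--               EtlMapStart
--               :-&gt; (EtlR $
--                       ROpStart
--                       :. (OrderBy [("NUM_ROWS", Desc)] $ From $ Tab projected))
--       in [projected, sorted]
--   myETL _ = []   -- any other number of source RTables yields no targets in this sketch
--   </pre>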
--   
--   <ul>
--   <li><i>-- 4.</i> In our case our ETL code consists of a single source
--   RTable that produces a single target RTable. The Julius expression is
--   evaluated into an <a>RTable</a> and returned to the caller of the ETL
--   code (in our case this is <tt>main</tt>).</li>
--   <li><i>-- 5.</i> Here is where we read our input for the ETL. In our
--   case, this is a simple CSV file that we read with the help of the
--   <tt>readCSV</tt> function.</li>
--   <li><i>-- 6.</i> We convert our input CSV to an <a>RTable</a> with
--   the <a>toRTable</a> function and pass it as input to our ETL code. We
--   execute our ETL code with the <a>runETL</a> function.</li>
--   <li><i>-- 7.</i> We print our target <a>RTable</a> on screen using the
--   <a>printfRTable</a> function for formatted printing (<a>printf</a>-like)
--   of <a>RTable</a>s.</li>
--   <li><i>-- 8.</i> We convert our target <a>RTable</a> back to CSV with
--   the <a>fromRTable</a> function and save it to a file with
--   <tt>writeCSV</tt>.</li>
--   </ul>
--   
--   <h2>Simple Julius Expression Examples</h2>
--   
--   Note: Julius Expressions are read from top to bottom and from left to
--   right.
--   
--   <h3>Selection (i.e., Filter)</h3>
--   
--   <h4>SQL</h4>
--   
--   <pre>
--   SELECT * 
--   FROM expenses exp 
--   WHERE   exp.category = 'FOOD:SUPER_MARKET' 
--           AND exp.amount &gt; 50.00
--   </pre>
--   
--   <h4>Julius</h4>
--   
--   <pre>
--   juliusToRTable $           
--       EtlMapStart
--       :-&gt; (EtlR $
--               ROpStart  
--               :. (Filter (From $ Tab expenses) $ FilterBy myFpred))
--   
--   myFpred :: RPredicate
--   myFpred = \t -&gt;    t &lt;!&gt; "category" == "FOOD:SUPER_MARKET" 
--                       &amp;&amp; 
--                       t &lt;!&gt; "amount" &gt; 50.00
--   </pre>
--   
--   <h3>Projection </h3>
--   
--   <h4>SQL</h4>
--   
--   <pre>
--   SELECT "TxTimeStamp", "Category", "Description", "Amount" 
--   FROM expenses exp 
--   WHERE   exp.category = 'FOOD:SUPER_MARKET' 
--           AND exp.amount &gt; 50.00
--   </pre>
--   
--   <h4>Julius</h4>
--   
--   <pre>
--   juliusToRTable $           
--       EtlMapStart
--       :-&gt; (EtlR $
--               ROpStart  
--               :. (Filter (From $ Tab expenses) $ FilterBy myFpred)
--               :. (Select ["TxTimeStamp", "Category","Description", "Amount"] $ From Previous))
--   
--   myFpred :: RPredicate
--   myFpred = \t -&gt; t &lt;!&gt; "category" == "FOOD:SUPER_MARKET" 
--                   &amp;&amp; 
--                   t &lt;!&gt; "amount" &gt; 50.00
--   </pre>
--   
--   <h3>Sorting (Order By)</h3>
--   
--   <h4>SQL</h4>
--   
--   <pre>
--   SELECT "TxTimeStamp", "Category", "Description", "Amount" 
--   FROM expenses exp 
--   WHERE   exp.category = 'FOOD:SUPER_MARKET' 
--           AND exp.amount &gt; 50.00
--   ORDER BY "TxTimeStamp" DESC        
--   </pre>
--   
--   <h4>Julius</h4>
--   
--   <pre>
--   juliusToRTable $           
--       EtlMapStart
--       :-&gt; (EtlR $
--               ROpStart  
--               :. (Filter (From $ Tab expenses) $ FilterBy myFpred)
--               :. (Select ["TxTimeStamp", "Category","Description", "Amount"] $ From Previous)
--               :. (OrderBy [("TxTimeStamp", Desc)] $ From Previous))
--   
--   myFpred :: RPredicate
--   myFpred = \t -&gt; t &lt;!&gt; "category" == "FOOD:SUPER_MARKET" 
--                   &amp;&amp; 
--                   t &lt;!&gt; "amount" &gt; 50.00
--   </pre>
--   
--   <h3>Grouping and Aggregation</h3>
--   
--   <h4>SQL</h4>
--   
--   <pre>
--   SELECT "Category",  sum("Amount") AS "TotalAmount"
--   FROM expenses exp 
--   GROUP BY  "Category"
--   ORDER BY "TotalAmount" DESC        
--   </pre>
--   
--   <h4>Julius</h4>
--   
--   <pre>
--   juliusToRTable $           
--       EtlMapStart
--       :-&gt; (EtlR $
--               ROpStart  
--               :. (GroupBy ["Category"]  
--                           (AggOn [Sum "Amount" $ As "TotalAmount"] (From $ Tab expenses)) $ 
--                           GroupOn  (\t1 t2 -&gt; t1 &lt;!&gt; "Category" == t2 &lt;!&gt; "Category")
--                   )
--               :. (OrderBy [ ("TotalAmount", Desc)] $ From Previous))
--   </pre>
--   
--   <h3>Group By and then Right Outer Join </h3>
--   
--   First group the expenses table by category of expenses and then do a
--   right outer join with the budget table, in order to pair the expenses
--   at the category level with the corresponding budget amount (if there
--   is one). Preserve the rows from the expenses table.
--   
--   <h4>SQL</h4>
--   
--   <pre>
--   WITH exp
--   as (
--       SELECT "Category",  sum("Amount") AS "TotalAmount"
--       FROM expenses
--       GROUP BY  "Category"
--   )
--   SELECT  exp."Category", exp."TotalAmount", bdg."YearlyBudget"
--   FROM budget bdg RIGHT JOIN exp ON (bdg."Category" = exp."Category")
--   ORDER BY exp."TotalAmount" DESC        
--   </pre>
--   
--   <h4>Julius</h4>
--   
--   <pre>
--   juliusToRTable $           
--       EtlMapStart
--       :-&gt; (EtlR $
--               ROpStart  
--               :. (GroupBy ["Category"]  
--                           (AggOn [Sum "Amount" $ As "TotalAmount"] (From $ Tab expenses)) $ 
--                           GroupOn  (\t1 t2 -&gt; t1 &lt;!&gt; "Category" == t2 &lt;!&gt; "Category")
--                   )
--                   -- &gt;&gt;&gt; A Right Outer Join that preserves the Previous result RTuples and joins with the budget table
--               :. (RJoin (TabL budget) Previous $ 
--                           JoinOn (\tl tr -&gt; 
--                                       tl &lt;!&gt; "Category" == tr &lt;!&gt; "Category")
--                   )
--               :. (OrderBy [ ("TotalAmount", Desc)] $ From Previous))
--   </pre>
--   
--   <h3>A Column Mapping </h3>
--   
--   We will use the previous result (i.e., a table with expenses and
--   budget amounts per category of expenses), in order to create a derived
--   column to calculate the residual amount, with the help of a Column
--   Mapping Expression (<a>ColMappingExpr</a>).
--   
--   <h4>SQL</h4>
--   
--   <pre>
--   WITH exp
--   as (
--       SELECT "Category",  sum("Amount") AS "TotalAmount"
--       FROM expenses
--       GROUP BY  "Category"
--   )
--   SELECT  exp."Category", exp."TotalAmount", bdg."YearlyBudget", bdg."YearlyBudget" - exp."TotalAmount" AS "ResidualAmount"
--   FROM budget bdg RIGHT JOIN exp ON (bdg."Category" = exp."Category")
--   ORDER BY exp."TotalAmount" DESC        
--   </pre>
--   
--   <h4>Julius</h4>
--   
--   <pre>
--   juliusToRTable $           
--       EtlMapStart
--       :-&gt; (EtlR $
--               ROpStart  
--               :. (GroupBy ["Category"]  
--                           (AggOn [Sum "Amount" $ As "TotalAmount"] (From $ Tab expenses)) $ 
--                           GroupOn  (\t1 t2 -&gt; t1 &lt;!&gt; "Category" == t2 &lt;!&gt; "Category")
--                   )
--               :. (RJoin (TabL budget) Previous $ 
--                           JoinOn (\tl tr -&gt; 
--                                       tl &lt;!&gt; "Category" == tr &lt;!&gt; "Category")
--                   )
--               :. (Select ["Category",  "TotalAmount", "YearlyBudget"] $ From Previous)
--               :. (OrderBy [ ("TotalAmount", Desc)] $ From Previous)
--           )
--           -- &gt;&gt;&gt; A Column Mapping to create a derived column ("ResidualAmount")
--       :-&gt; (EtlC $
--               Source ["TotalAmount", "YearlyBudget"] $
--               Target ["ResidualAmount"] $
--               By (\[totAmount, yearlyBudget] -&gt; [yearlyBudget - totAmount]) 
--                   (On Previous) DontRemoveSrc $ FilterBy (\t -&gt; True)
--           )
--   </pre>
--   
--   <h3>Naming Intermediate Results in a Julius Expression</h3>
--   
--   In the following example, each named result is an autonomous
--   (intermediate) result that can be accessed directly and handled
--   as a distinct result (i.e., an <a>RTable</a>). So we can refer to such
--   a result in a subsequent expression, or we can print this result
--   separately with the <a>printRTable</a> function, etc. Each such named
--   result resembles a separate subquery in an SQL <tt>WITH</tt> clause.
--   
--   <h4>SQL</h4>
--   
--   <pre>
--   WITH detailYearTXTab
--   as (
--       SELECT "TxTimeStamp", "Category", "Description", "Amount","DebitCredit"
--       FROM txTab
--       WHERE   to_number(to_char("TxTimeStamp", 'YYYY')) &gt;= yearInput 
--               AND 
--               to_number(to_char("TxTimeStamp", 'YYYY')) &lt; yearInput + 1
--   ),
--   expGroupbyCategory
--   as (
--       SELECT "Category", sum ("Amount") AS "AmountSpent"
--       FROM detailYearTXTab
--       WHERE
--           "DebitCredit" = 'D'
--       GROUP BY "Category"
--       ORDER BY 2 DESC
--   ),
--   revGroupbyCategory
--   as (
--       SELECT "Category", sum("Amount") AS "AmountReceived"
--       FROM detailYearTXTab
--       WHERE
--           "DebitCredit" = 'C'
--       GROUP BY "Category"
--       ORDER BY 2 DESC
--   ),
--   ojoinedWithBudget
--   as(
--       SELECT "Category", "AmountSpent", "YearlyBudget"
--       FROM budget bdg RIGHT JOIN expGroupbyCategory exp ON (bdg."Category" = exp."Category")
--   ),
--   calculatedFields
--   as(
--       SELECT "Category", "AmountSpent", "YearlyBudget", "YearlyBudget" - "AmountSpent" AS "ResidualAmount" 
--       FROM ojoinedWithBudget
--   )
--   SELECT  *
--   FROM calculatedFields
--   </pre>
--   
--   <h4>Julius</h4>
--   
--   <pre>
--   let
--       julExpr = 
--           EtlMapStart
--           -- 1. get detailed transactions of the year
--           :=&gt; NamedResult "detailYearTXtab" (EtlR $
--                   ROpStart
--                   -- keep RTuples only of the specified year
--                   :. (Filter (From $ Tab txTab) $    
--                           FilterBy (\t -&gt;       rtime (t &lt;!&gt; "TxTimeStamp") &gt;= 
--                                                                           RTimestampVal {year = yearInput, month = 1, day = 1, hours24 = 0, minutes = 0, seconds = 0}
--                                                                       &amp;&amp;
--                                                                           rtime (t &lt;!&gt; "TxTimeStamp") &lt; 
--                                                                           RTimestampVal { year = yearInput + 1, 
--                                                                                           month = 1,
--                                                                                           day = 1, hours24 = 0, minutes = 0, seconds = 0})
--                       )
--                   -- keep only columns of interest
--                   :. (Select ["TxTimeStamp", "Category", "Description", "Amount","DebitCredit"] $ From Previous)
--                   :. (OrderBy [("TxTimeStamp", Asc)] $ From Previous)
--               )
--           -- 2. expenses group by category
--           :=&gt; NamedResult "expGroupbyCategory" (EtlR $
--                   ROpStart
--                       -- keep only the "debit" transactions
--                   :. (Filter      (From Previous) $
--                                   FilterBy (\t -&gt; t &lt;!&gt; "DebitCredit" == "D")
--                       )                
--                   :. (GroupBy ["Category"] 
--                               (AggOn [Sum "Amount" $ As "AmountSpent" ]  $ From Previous) $ 
--                               GroupOn (\t1 t2 -&gt;  t1 &lt;!&gt; "Category" == t2 &lt;!&gt; "Category")
--                       )
--                   :. (OrderBy [("AmountSpent", Desc)] $ From Previous)
--               )
--           -- 3. revenues group by category
--           :=&gt; NamedResult "revGroupbyCategory" (EtlR $
--                   ROpStart
--                       -- keep only the "credit" transactions
--                   :. (Filter      (From $ Tab $ juliusToRTable $ takeNamedResult "detailYearTXtab" julExpr) $
--                                   FilterBy (\t -&gt; t &lt;!&gt; "DebitCredit" == "C")
--                       )                
--                   :. (GroupBy ["Category"] 
--                               (AggOn [Sum "Amount" $ As "AmountReceived" ]  $ From Previous) $ 
--                               GroupOn (\t1 t2 -&gt;  t1 &lt;!&gt; "Category" == t2 &lt;!&gt; "Category")
--                       )
--                   :. (OrderBy [("AmountReceived", Desc)] $ From Previous)
--               )
--           -- 3. Expenses Group By Category Outer joined with budget info
--           :=&gt; NamedResult "ojoinedWithBudget" (EtlR $
--                   ROpStart
--                   :. (RJoin (TabL budget) (Tab $ juliusToRTable $ takeNamedResult "expGroupbyCategory" julExpr) $ 
--                               JoinOn (\tl tr -&gt; 
--                                           tl &lt;!&gt; "Category" == tr &lt;!&gt; "Category")
--                       )
--                   :. (Select ["Category",  "AmountSpent", "YearlyBudget"] $ From Previous)
--                   :. (OrderBy [ ("AmountSpent", Desc)] $ From Previous)
--               )
--           -- 4. A Column Mapping to create a derived column ("ResidualAmount")
--           :=&gt; NamedResult "calculatedFields"  (EtlC $
--                   Source ["AmountSpent", "YearlyBudget"] $
--                   Target ["ResidualAmount"] $
--                   By (\[amountSpent, yearlyBudget] -&gt; [yearlyBudget - amountSpent]) 
--                       (On Previous) DontRemoveSrc $ FilterBy (\t -&gt; True)
--               )
--   
--   -- 5. Print detail transactions
--   printRTable $ juliusToRTable $ takeNamedResult "detailYearTXtab" julExpr
--   
--   -- 6. Print Expenses by Category
--   printRTable $ juliusToRTable $ takeNamedResult "expGroupbyCategory" julExpr
--   
--   -- 7. Print Expenses with Budgeting Info and Residual Amount
--   printRTable $ juliusToRTable $ takeNamedResult "calculatedFields" julExpr
--   
--   -- equivalently
--   printRTable $ juliusToRTable julExpr
--   
--   -- 8. Print Revenues by Category
--   printRTable $ juliusToRTable $ takeNamedResult "revGroupbyCategory" julExpr
--   </pre>
--   
--   Explanation of each named result in the above example:
--   
--   <ul>
--   <li><i>"detailYearTXtab"</i> We retrieve the detail transactions of
--   the specified year only and project the columns of interest. We order
--   the result by transaction timestamp.</li>
--   <li><i>"expGroupbyCategory"</i> We filter the previous result
--   ("detailYearTXtab"), in order to get only the transactions
--   corresponding to expenses (<tt>t &lt;!&gt; "DebitCredit" == "D"</tt>)
--   and then we group by category, in order to get a total amount spent by
--   category of expenses.</li>
--   <li><i>"revGroupbyCategory"</i> We filter the "detailYearTXtab" result
--   (see the use of the <a>takeNamedResult</a> function in order to access
--   the "detailYearTXtab" intermediate result), in order to get only the
--   transactions corresponding to revenues (<tt>t &lt;!&gt; "DebitCredit"
--   == "C"</tt>) and then we group by category, in order to get a total
--   amount received by category of revenues.</li>
--   <li><i>"ojoinedWithBudget"</i> We do a right join of the grouped
--   expenses with the budget table. Again note the use of the
--   <a>takeNamedResult</a> function. The preserved <a>RTable</a> is the
--   grouped expenses ("expGroupbyCategory").</li>
--   <li><i>"calculatedFields"</i> Finally, we use a Column Mapping in
--   order to create a derived column, which holds the residual amount for
--   each category of expenses.</li>
--   </ul>
module Etl.Julius

-- | An ETL Mapping Expression is a <b>"Julius Expression"</b>. It is a
--   sequence of individual ETL Operation Expressions. Each such ETL
--   Operation "acts" on some input <a>RTable</a> and produces a new
--   "transformed" output <a>RTable</a>. The ETL Mapping connector
--   <a>:-&gt;</a> (as well as the <a>:=&gt;</a> connector) is left
--   associative because in an <a>ETLMappingExpr</a> operations are
--   evaluated from left to right (or top to bottom). A Named ETL Operation
--   Expression (<a>NamedMap</a>) is just an ETL Operation with a name, so
--   that we can reference this specific step in the chain of ETL
--   Operations. It is actually a <i>named intermediate result</i>, which
--   we can reference and use in other parts of our Julius expression.
data ETLMappingExpr
EtlMapStart :: ETLMappingExpr
(:->) :: ETLMappingExpr -> ETLOpExpr -> ETLMappingExpr
(:=>) :: ETLMappingExpr -> NamedMap -> ETLMappingExpr
infixl 5 :->
infixl 5 :=>

-- | An ETL Operation Expression is either a Column Mapping Expression
--   (<a>ColMappingExpr</a>), or a Relational Operation Expression
--   (<a>ROpExpr</a>)
data ETLOpExpr
EtlC :: ColMappingExpr -> ETLOpExpr
EtlR :: ROpExpr -> ETLOpExpr

-- | A named intermediate result in a Julius expression
--   (<a>ETLMappingExpr</a>), which we can access via the
--   <a>takeNamedResult</a> function.
data NamedMap
NamedResult :: NamedResultName -> ETLOpExpr -> NamedMap

-- | The name of an intermediate result, used as a key for accessing this
--   result via the <a>takeNamedResult</a> function.
type NamedResultName = String

-- | A Column Mapping (<a>RColMapping</a>) is the main ETL/ELT construct
--   for defining a column-level transformation. Essentially with a Column
--   Mapping we can create one or more new (derived) column(s) (<i>Target
--   Columns</i>), based on an arbitrary transformation function
--   (<a>ColXForm</a>) with input parameters any of the existing columns
--   (<i>Source Columns</i>). So a <a>ColMappingExpr</a> is either empty,
--   or it defines the source columns, the target columns and the
--   transformation from source to target. Notes:
--   
--   <ul>
--   <li>If a target column has the same name as a source column and either
--   <a>DontRemoveSrc</a> or <a>RemoveSrc</a> has been specified, then the
--   (target-column, target-value) key-value pair overwrites the
--   corresponding (source-column, source-value) key-value pair.</li>
--   <li>The returned <a>RTable</a> will include only the <a>RTuple</a>s
--   that satisfy the filter predicate specified in the <a>FilterBy</a>
--   clause.</li>
--   </ul>
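--   
--   The two notes above can be illustrated with a toy model in plain
--   Haskell. This is only a sketch under simplifying assumptions and not the
--   library's implementation: a row is an association list of
--   <tt>Int</tt> columns, and <tt>Row</tt>, <tt>Removal</tt>,
--   <tt>mapRow</tt> and <tt>runMapping</tt> are hypothetical names.
--   
--   <pre>
--   type Row = [(String, Int)]
--   
--   data Removal = Remove | Keep
--   
--   -- Apply a column mapping to a single row: compute the target columns from
--   -- the source columns; a target replaces any existing column of the same name.
--   mapRow :: [String] -&gt; [String] -&gt; ([Int] -&gt; [Int]) -&gt; Removal -&gt; Row -&gt; Row
--   mapRow srcs tgts f removal row =
--       let inputs  = [v | s &lt;- srcs, Just v &lt;- [lookup s row]]
--           targets = zip tgts (f inputs)
--           dropped = case removal of
--                       Remove -&gt; srcs ++ tgts
--                       Keep   -&gt; tgts
--           kept    = [(k, v) | (k, v) &lt;- row, notElem k dropped]
--       in kept ++ targets
--   
--   -- Keep only the rows that satisfy the predicate, then map each of them.
--   runMapping :: (Row -&gt; Bool) -&gt; (Row -&gt; Row) -&gt; [Row] -&gt; [Row]
--   runMapping p m = map m . filter p
--   
--   residual :: [Int] -&gt; [Int]
--   residual [total, budget] = [budget - total]
--   residual _               = []
--   
--   example :: [Row]
--   example =
--       runMapping (const True)
--                  (mapRow ["TotalAmount", "YearlyBudget"] ["ResidualAmount"] residual Keep)
--                  [[("TotalAmount", 70), ("YearlyBudget", 100)]]
--   -- [[("TotalAmount",70),("YearlyBudget",100),("ResidualAmount",30)]]
--   </pre>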
data ColMappingExpr
Source :: [ColumnName] -> ToColumn -> ColMappingExpr
ColMappingEmpty :: ColMappingExpr

-- | Defines the Target Columns of a Column Mapping Expression
--   (<a>ColMappingExpr</a>) and the column transformation function
--   (<a>ColXForm</a>).
data ToColumn
Target :: [ColumnName] -> ByFunction -> ToColumn

-- | Defines the column transformation function of a Column Mapping
--   Expression (<a>ColMappingExpr</a>), the input <a>RTable</a> on which
--   this transformation will take place, an indicator
--   (<a>RemoveSrcCol</a>) of whether the Source Columns will be removed
--   or not in the new <a>RTable</a> that will be created after the Column
--   Mapping is executed and, finally, an <a>RTuple</a> filter predicate
--   (<a>ByPred</a>) that defines the subset of <a>RTuple</a>s that this
--   Column Mapping will be applied to. If it must be applied to all
--   <a>RTuple</a>s, then for the last parameter (<a>ByPred</a>), we can
--   just provide the following <a>RPredicate</a>:
--   
--   <pre>
--   FilterBy (\_ -&gt; True) 
--   </pre>
data ByFunction
By :: ColXForm -> OnRTable -> RemoveSrcCol -> ByPred -> ByFunction

-- | Defines the <a>RTable</a> that the current operation will be applied
--   to.
data OnRTable
On :: TabExpr -> OnRTable

-- | A Table Expression defines the <a>RTable</a> on which the current ETL
--   Operation will be applied. If the <a>Previous</a> constructor is used,
--   then this <a>RTable</a> is the result of the previous ETL Operations
--   in the current Julius Expression (<a>ETLMappingExpr</a>)
data TabExpr
Tab :: RTable -> TabExpr
Previous :: TabExpr

-- | Indicator of whether the source column(s) in a Column Mapping will be
--   removed or not (used in <a>ColMappingExpr</a>). If a target column has
--   the same name as a source column and <a>DontRemoveSrc</a> has been
--   specified, then the (target-column, target-value) key-value pair
--   overwrites the corresponding (source-column, source-value) key-value
--   pair.
data RemoveSrcCol
RemoveSrc :: RemoveSrcCol
DontRemoveSrc :: RemoveSrcCol

-- | An <a>RTuple</a> predicate clause.
data ByPred
FilterBy :: RPredicate -> ByPred

-- | Predicate for Deletion Operation
data ByDelPred
Where :: RPredicate -> ByDelPred

-- | The Set sub-clause of an <a>Update</a> <a>RTable</a> clause. It
--   specifies each column to be updated along with the new value.
data SetColumns
Set :: [(ColumnName, RDataType)] -> SetColumns

-- | A Relational Operation Expression (<a>ROpExpr</a>) is a sequence of
--   one or more Relational Algebra Operations applied on an input
--   <a>RTable</a>. It is a sub-expression within a Julius Expression
--   (<a>ETLMappingExpr</a>) and we use it whenever we want to apply
--   relational algebra operations on an RTable (which might be the result
--   of previous operations in a Julius Expression). A Julius Expression
--   (<a>ETLMappingExpr</a>) can contain an arbitrary number of
--   <a>ROpExpr</a>s. The relational operation connector <a>:.</a> is left
--   associative because in an <a>ROpExpr</a> operations are evaluated from
--   left to right (or top to bottom).
data ROpExpr
ROpStart :: ROpExpr
(:.) :: ROpExpr -> RelationalOp -> ROpExpr
infixl 6 :.

-- | The Relational Operation (<a>RelationalOp</a>) is a Julius clause that
--   represents a <b>Relational Algebra Operation</b>.
data RelationalOp

-- | <a>RTuple</a> filtering clause (selection operation), based on an
--   arbitrary predicate function (<a>RPredicate</a>)
Filter :: FromRTable -> ByPred -> RelationalOp

-- | Column projection clause
Select :: [ColumnName] -> FromRTable -> RelationalOp

-- | Aggregate Operation clause
Agg :: Aggregate -> RelationalOp

-- | Group By clause, based on an arbitrary Grouping predicate function
--   (<a>RGroupPredicate</a>)
GroupBy :: [ColumnName] -> Aggregate -> GroupOnPred -> RelationalOp

-- | Inner Join clause, based on an arbitrary join predicate function - not
--   just equi-join - (<a>RJoinPredicate</a>)
Join :: TabLiteral -> TabExpr -> TabExprJoin -> RelationalOp

-- | Left Join clause, based on an arbitrary join predicate function - not
--   just equi-join - (<a>RJoinPredicate</a>)
LJoin :: TabLiteral -> TabExpr -> TabExprJoin -> RelationalOp

-- | Right Join clause, based on an arbitrary join predicate function - not
--   just equi-join - (<a>RJoinPredicate</a>)
RJoin :: TabLiteral -> TabExpr -> TabExprJoin -> RelationalOp

-- | Full Outer Join clause, based on an arbitrary join predicate function
--   - not just equi-join - (<a>RJoinPredicate</a>)
FOJoin :: TabLiteral -> TabExpr -> TabExprJoin -> RelationalOp

-- | Implements the semi-join operation between two RTables (any type of
--   join predicate is allowed). It returns the <a>RTuple</a>s from the left
--   <a>RTable</a> that match with the right <a>RTable</a>. Note that if an
--   <a>RTuple</a> from the left <a>RTable</a> matches more than one
--   <a>RTuple</a> from the right <a>RTable</a>, the semi-join operation
--   will return that left <a>RTuple</a> only once.
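--   
--   The semantics can be sketched on plain lists (a toy model, not the
--   library's implementation; <tt>semiJoin</tt> and <tt>antiJoin</tt> are
--   hypothetical names): each left element is kept at most once, however
--   many right elements it matches, and the anti-join keeps exactly the
--   left elements that the semi-join drops.
--   
--   <pre>
--   semiJoin :: (a -&gt; b -&gt; Bool) -&gt; [a] -&gt; [b] -&gt; [a]
--   semiJoin p ls rs = [l | l &lt;- ls, any (p l) rs]
--   
--   antiJoin :: (a -&gt; b -&gt; Bool) -&gt; [a] -&gt; [b] -&gt; [a]
--   antiJoin p ls rs = [l | l &lt;- ls, not (any (p l) rs)]
--   
--   matched, unmatched :: [Int]
--   matched   = semiJoin (==) [1, 2, 3] [2, 2, 3, 4]   -- [2,3]: 2 appears once despite two matches
--   unmatched = antiJoin (==) [1, 2, 3] [2, 2, 3, 4]   -- [1]
--   </pre>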
SemiJoin :: TabLiteral -> TabExpr -> TabExprJoin -> RelationalOp
SemiJoinP :: TabExpr -> TabLiteral -> TabExprJoin -> RelationalOp

-- | Implements the anti-join operation between two RTables (any type of
--   join predicate is allowed). It returns the <a>RTuple</a>s from the left
--   <a>RTable</a> that do not match with the right <a>RTable</a>.
AntiJoin :: TabLiteral -> TabExpr -> TabExprJoin -> RelationalOp
AntiJoinP :: TabExpr -> TabLiteral -> TabExprJoin -> RelationalOp

-- | Intersection clause
Intersect :: TabLiteral -> TabExpr -> RelationalOp

-- | Union clause. Note that this operation eliminates duplicate <a>RTuple</a>s.
Union :: TabLiteral -> TabExpr -> RelationalOp

-- | Union All clause. It is a Union operation without duplicate
--   <a>RTuple</a> elimination.
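--   
--   On plain lists, the difference between the two operations can be
--   sketched as follows (a toy model only; <tt>unionDistinct</tt> and
--   <tt>unionAll</tt> are hypothetical names):
--   
--   <pre>
--   import Data.List (nub)
--   
--   unionDistinct :: Eq a =&gt; [a] -&gt; [a] -&gt; [a]
--   unionDistinct xs ys = nub (xs ++ ys)   -- duplicates are eliminated
--   
--   unionAll :: [a] -&gt; [a] -&gt; [a]
--   unionAll xs ys = xs ++ ys              -- duplicates are preserved
--   
--   distinctRows, allRows :: [Int]
--   distinctRows = unionDistinct [1, 2] [2, 3]   -- [1,2,3]
--   allRows      = unionAll      [1, 2] [2, 3]   -- [1,2,2,3]
--   </pre>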
UnionAll :: TabLiteral -> TabExpr -> RelationalOp

-- | Minus clause (set Difference operation)
Minus :: TabLiteral -> TabExpr -> RelationalOp
MinusP :: TabExpr -> TabLiteral -> RelationalOp

-- | This is a generic unary operation on an RTable
--   (<a>UnaryRTableOperation</a>). It is used to define an arbitrary unary
--   operation on an <a>RTable</a>
GenUnaryOp :: OnRTable -> ByGenUnaryOperation -> RelationalOp

-- | This is a generic binary operation on an RTable
--   (<a>BinaryRTableOperation</a>). It is used to define an arbitrary
--   binary operation on an <a>RTable</a>
GenBinaryOp :: TabLiteral -> TabExpr -> ByGenBinaryOperation -> RelationalOp

-- | Order By clause.
OrderBy :: [(ColumnName, OrderingSpec)] -> FromRTable -> RelationalOp

-- | Delete operation. Deletes the <a>RTuple</a>s from an <a>RTable</a>
--   based on an <a>RPredicate</a>. Please note that this is an
--   <b>immutable</b> implementation of an <a>RTable</a> delete. This
--   simply means that the delete operation returns a new <a>RTable</a>.
--   So, the original <a>RTable</a> remains unchanged and no deletion
--   in-place takes place whatsoever. Moreover, if we have multiple threads
--   deleting an <a>RTable</a>, due to immutability, each thread "sees" its
--   own copy of the <a>RTable</a> and thus there is no need for locking
--   the deleted <a>RTuple</a>s, as happens in a common RDBMS.
Delete :: FromRTable -> ByDelPred -> RelationalOp

-- | Update an <a>RTable</a>. Please note that this is an <b>immutable</b>
--   implementation of an <a>RTable</a> update. This simply means that the
--   update operation returns a new <a>RTable</a> that includes all the
--   <a>RTuple</a>s of the original <a>RTable</a>, both the ones that have
--   been updated and the others that have not. So, the original
--   <a>RTable</a> remains unchanged and no update in-place takes place
--   whatsoever. Moreover, if we have multiple threads updating an
--   <a>RTable</a>, due to immutability, each thread "sees" its own copy of
--   the <a>RTable</a> and thus there is no need for locking the updated
--   <a>RTuple</a>s, as happens in a common RDBMS.
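--   
--   Immutability can be pictured with a toy update on a plain list (a sketch
--   only; <tt>updateWhere</tt> is a hypothetical name and not part of the
--   library): the function builds a new list, and the original value is
--   still available unchanged afterwards.
--   
--   <pre>
--   updateWhere :: (a -&gt; Bool) -&gt; (a -&gt; a) -&gt; [a] -&gt; [a]
--   updateWhere p f = map (\x -&gt; if p x then f x else x)
--   
--   original, updated :: [Int]
--   original = [10, 60, 80]
--   updated  = updateWhere (&gt; 50) (subtract 50) original   -- [10,10,30]
--   -- original is still [10,60,80]
--   </pre>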
Update :: TabExpr -> SetColumns -> ByPred -> RelationalOp

-- | Insert Operation. It can insert into an <a>RTable</a> a single
--   <a>RTuple</a> or a whole <a>RTable</a>. The latter is the equivalent
--   of an <tt>INSERT INTO SELECT</tt> clause in SQL, since an
--   <a>RTable</a> can be the result of a Julius expression (playing the
--   role of a subquery within the Insert clause, in this case). Please
--   note that this is an <b>immutable</b> implementation of an
--   <a>RTable</a> insert. This simply means that the insert operation
--   returns a new <a>RTable</a> and does not affect the original
--   <a>RTable</a>. Also note that the source and target <a>RTable</a>s
--   should have the same structure. By "structure", we mean that the
--   <a>ColumnName</a>s and the corresponding data types must match.
--   Essentially what we record in the <a>ColumnInfo</a> must be the same
--   for the two <a>RTable</a>s. Otherwise a
--   <a>ConflictingRTableStructures</a> exception will be thrown.
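--   
--   As an illustration, the <tt>INSERT INTO SELECT</tt> flavour could be
--   sketched as below. This is a minimal sketch, not a definitive usage:
--   <tt>trgTab</tt> and <tt>srcTab</tt> are hypothetical <a>RTable</a>s
--   that are assumed to be in scope and to share the same structure.
--   
--   <pre>
--   juliusToRTable $
--       EtlMapStart
--           :-&gt; (EtlR $
--                   ROpStart
--                   :.(Insert $
--                       Into (Tab trgTab) $
--                           RTuples $ TabSrc srcTab  -- srcTab plays the role of the subquery
--                   )
--               )
--   </pre>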
Insert :: IntoClause -> RelationalOp

-- | Upsert (Update+Insert, aka Merge) Operation. We provide a source
--   <a>RTable</a> and a matching condition (<a>RUpsertPredicate</a>) to
--   the <a>RTuple</a>s of the target <a>RTable</a>. An <a>RTuple</a> from
--   the target <a>RTable</a> may match at most one <a>RTuple</a> in the
--   source <a>RTable</a>, or not match at all. If it matches more than
--   one <a>RTuple</a>, then an exception
--   (<a>UniquenessViolationInUpsert</a>) is thrown. When an <a>RTuple</a>
--   from the target <a>RTable</a> is matched to a source <a>RTuple</a>,
--   then the corresponding columns of the target <a>RTuple</a> are updated
--   with the new values provided in the source <a>RTuple</a>. This takes
--   place only for the target <a>RTuple</a>s that match and also satisfy
--   the input <a>RPredicate</a>. Thus we can further restrict, with a
--   filter, the <a>RTuple</a>s of the target <a>RTable</a> where the
--   update will take place. Finally, the source <a>RTuple</a>s that did
--   not match any <a>RTuple</a> of the target <a>RTable</a> are inserted
--   (appended) to the target <a>RTable</a>.
--   
--   Please note that this is an <b>immutable</b> implementation of an
--   <a>RTable</a> upsert. This simply means that the upsert operation
--   returns a new <a>RTable</a> and does not affect the original
--   <a>RTable</a>. Also note that the source and target <a>RTable</a>s
--   should have the same structure. By "structure", we mean that the
--   <a>ColumnName</a>s and the corresponding data types must match.
--   Essentially what we record in the <a>ColumnInfo</a> must be the same
--   for the two <a>RTable</a>s. Otherwise a
--   <a>ConflictingRTableStructures</a> exception will be thrown.
--   
--   <pre>
--   An Example:
--   Source RTable: srcTab = 
--       Id  |   Msg         | Other
--       ----|---------------|-------
--       1   |   "updated"   |"a"    
--       2   |   "world2"    |"a"    
--       3   |   "inserted"  |"a"    
--   
--   Target RTable: trgTab = 
--       Id  |   Msg         | Other
--       ----|---------------|-------
--       1   |   "hello1"    |"b"    
--       2   |   "world1"    |"b"    
--       4   |   "old"       |"b"    
--       5   |   "hello"     |"b"    
--             
--   juliusToRTable $
--       EtlMapStart
--           :-&gt; (EtlR $
--                   ROpStart
--                   :.(Upsert $ 
--                       MergeInto (Tab trgTab) $
--                           Using (TabSrc srcTab) $
--                               MergeOn (RUpsertPredicate ["Id"] (\t1 t2 -&gt; t1 &lt;!&gt; "Id" == t2 &lt;!&gt; "Id")) $  -- merge condition: srcTab.Id == trgTab.Id
--                                   WhenMatchedThen $
--                                       UpdateCols ["Msg"] $
--                                           FilterBy (\t -&gt;    let
--                                                                 msg = case toText (t &lt;!&gt; "Msg") of
--                                                                   Just t -&gt; t
--                                                                   Nothing -&gt; pack ""
--                                                               in (take 5 msg) == (pack "hello")
--                                                    )  -- Msg like "hello%"
--                   )
--               )
--   
--   Result RTable: 
--       Id  |   Msg         | Other
--       ----|---------------|-------
--       1   |   "updated"   |"b"   -- Updated RTuple. Note that only column "Msg" has been overwritten, as per the UpdateCols subclause
--       2   |   "world1"    |"b"   -- Not affected due to FilterBy predicate
--       3   |   "inserted"  |"a"   -- Inserted RTuple
--       4   |   "old"       |"b"   -- Not affected due to MergeOn condition
--       5   |   "hello"     |"b"   -- Not affected due to MergeOn condition  
--   </pre>
Upsert :: MergeInto -> RelationalOp

-- | Resembles the "FROM" clause in SQL. It defines the <a>RTable</a> on
--   which the Relational Operation will be applied
data FromRTable
From :: TabExpr -> FromRTable

-- | An Aggregate Operation Clause
data Aggregate
AggOn :: [AggOp] -> FromRTable -> Aggregate

-- | These are the available aggregate operation clauses
data AggOp
Sum :: ColumnName -> AsColumn -> AggOp

-- | Count aggregation (no distinct)
Count :: ColumnName -> AsColumn -> AggOp

-- | Count distinct aggregation (i.e., <tt>count(distinct col)</tt> in
--   SQL). Returns the distinct number of values for this column.
CountDist :: ColumnName -> AsColumn -> AggOp

-- | Returns the number of <a>RTuple</a>s in the <a>RTable</a> (i.e.,
--   <tt>count(*)</tt> in SQL)
CountStar :: AsColumn -> AggOp
Min :: ColumnName -> AsColumn -> AggOp
Max :: ColumnName -> AsColumn -> AggOp

-- | Average aggregation
Avg :: ColumnName -> AsColumn -> AggOp

-- | String aggregation
StrAgg :: ColumnName -> AsColumn -> Delimiter -> AggOp

-- | A custom aggregate operation
GenAgg :: ColumnName -> AsColumn -> AggBy -> AggOp

-- | Defines the name of the column that will hold the aggregate operation
--   result. It resembles the "AS" clause in SQL.
data AsColumn
As :: ColumnName -> AsColumn

-- | Julius Clause to provide a custom aggregation function
data AggBy
AggBy :: AggFunction -> AggBy

-- | A grouping predicate clause. It defines an arbitrary function
--   (<tt>RGroupPredicate</tt>), which determines when two <a>RTuple</a>s
--   should belong to the same group.
data GroupOnPred
GroupOn :: RGroupPredicate -> GroupOnPred

-- | This clause is used for expressions where we do not allow the use of
--   the Previous value
data TabLiteral
TabL :: RTable -> TabLiteral

-- | Join Predicate Clause. It defines when two <a>RTuple</a>s should be
--   paired.
data TabExprJoin
JoinOn :: RJoinPredicate -> TabExprJoin

-- | It is used to define an arbitrary unary operation on an <a>RTable</a>
data ByGenUnaryOperation
ByUnaryOp :: UnaryRTableOperation -> ByGenUnaryOperation

-- | It is used to define an arbitrary binary operation on an <a>RTable</a>
data ByGenBinaryOperation
ByBinaryOp :: BinaryRTableOperation -> ByGenBinaryOperation

-- | Insert Into subclause
data IntoClause
Into :: TabExpr -> InsertSource -> IntoClause

-- | Subclause on <a>Insert</a> clause. Defines the source of the insert
--   operation. The <tt>Values</tt> branch is used for inserting a single
--   <a>RTuple</a>, while the <tt>RTuples</tt> branch is used for inserting
--   a whole <a>RTable</a>, typically derived as the result of a Julius
--   expression. The former is similar in concept to an <tt>INSERT INTO
--   VALUES</tt> SQL clause, and the latter is similar in concept to an
--   <tt>INSERT INTO SELECT</tt> SQL clause.
data InsertSource
Values :: ValuesClause -> InsertSource
RTuples :: TabSource -> InsertSource

-- | Subclause on <a>Insert</a> clause. Defines the source <a>RTuple</a> of
--   the insert operation.
type ValuesClause = [(ColumnName, RDataType)]

-- | This subclause refers to the source <a>RTable</a> that will feed an
--   <a>Insert</a> operation
data TabSource
TabSrc :: RTable -> TabSource

-- | Merge Into subclause
data MergeInto
MergeInto :: TabExpr -> MergeSource -> MergeInto

-- | Upsert source subclause (Using clause in <tt>SQL</tt>)
data MergeSource
Using :: TabSource -> MergeMatchCondition -> MergeSource

-- | Upsert matching condition subclause
data MergeMatchCondition
MergeOn :: RUpsertPredicate -> WhenMatched -> MergeMatchCondition

-- | When Matched subclause of Upsert
data WhenMatched
WhenMatchedThen :: UpdateColumns -> WhenMatched

-- | Update columns subclause of Upsert
data UpdateColumns
UpdateCols :: [ColumnName] -> ByPred -> UpdateColumns

-- | Evaluates (parses) the Julius expression and produces an
--   <a>ETLMapping</a>. The <a>ETLMapping</a> is an internal representation
--   of the Julius expression and one needs to combine it with the
--   <a>etl</a> function, in order to evaluate the Julius expression into
--   an <a>RTable</a>. This can be achieved directly with the function
--   <a>juliusToRTable</a>.
evalJulius :: ETLMappingExpr -> ETLMapping

-- | Pure code to evaluate the "ETL-logic" of a Julius expression and
--   generate the corresponding target RTable.
--   
--   The evaluation of a Julius expression (i.e., an <a>ETLMappingExpr</a>)
--   to an RTable is strict. It evaluates fully to Normal Form (NF) as
--   opposed to a lazy evaluation (i.e., only during IO), or evaluation to
--   a WHNF. This is for efficiency reasons (e.g., avoid space leaks and
--   excessive memory usage). A consequence is that exceptions will be
--   thrown at the same line of code where <a>juliusToRTable</a> is
--   called. Thus one should wrap this call with a <a>catch</a> handler, or
--   use <a>eitherPrintRTable</a>, or <a>eitherPrintfRTable</a>, if one
--   wants to handle the exception gracefully.
--   
--   Example:
--   
--   <pre>
--   do 
--    catch (printRTable $ juliusToRTable $ &lt;a Julius expression&gt; )
--          (\e -&gt; putStrLn $ "There was an error in the Julius evaluation: " ++ (show (e::SomeException)) )
--   </pre>
--   
--   Or, similarly
--   
--   <pre>
--   do 
--    p &lt;- (eitherPrintRTable  printRTable $
--                             juliusToRTable $ &lt;a Julius expression&gt;                                 
--         ) :: IO (Either SomeException ())
--    case p of
--              Left exc -&gt; putStrLn $ "There was an error in the Julius evaluation: " ++ (show exc)
--              Right _  -&gt; return ()
--   </pre>
juliusToRTable :: ETLMappingExpr -> RTable

-- | Evaluate a Julius expression within the IO Monad. I.e., Effectful code
--   to evaluate the "ETL-logic" of a Julius expression and generate the
--   corresponding target RTable.
--   
--   The evaluation of a Julius expression (i.e., an <a>ETLMappingExpr</a>)
--   to an RTable is strict. It evaluates fully to Normal Form (NF) as
--   opposed to a lazy evaluation (i.e., only during IO), or evaluation to
--   a WHNF. This is for efficiency reasons (e.g., avoid space leaks and
--   excessive memory usage). A consequence is that exceptions will be
--   thrown at the same line of code where <a>runJulius</a> is called.
--   Thus one should wrap this call with a <a>catch</a> handler, or use
--   <a>eitherRunJulius</a>, if one wants to handle the exception
--   gracefully.
--   
--   Example:
--   
--   <pre>
--   do 
--       result &lt;- catch (runJulius $  &lt;a Julius expression&gt;)
--                       (\e -&gt; do 
--                                putStrLn $ "there was an error in Julius evaluation: " ++ (show (e::SomeException))
--                                return emptyRTable
--                       )
--   </pre>
runJulius :: ETLMappingExpr -> IO RTable

-- | Evaluate a Julius expression and return the corresponding target
--   <a>RTable</a> or an exception. One can define custom exceptions to be
--   thrown within a Julius expression. This function will catch any
--   exceptions that are instances of the <a>Exception</a> type class.
--   
--   The evaluation of a Julius expression (i.e., an <a>ETLMappingExpr</a>)
--   to an <a>RTable</a> is strict. It evaluates fully to Normal Form (NF)
--   as opposed to a lazy evaluation (i.e., only during IO), or evaluation
--   to a WHNF. This is for efficiency reasons (e.g., avoid space leaks and
--   excessive memory usage).
--   
--   Example:
--   
--   <pre>
--   do 
--       res &lt;- (eitherRunJulius $ &lt;a Julius expression&gt;) :: IO (Either SomeException RTable) 
--       resultRTab  &lt;- case res of
--                       Right t  -&gt; return t
--                       Left exc -&gt;  do 
--                                       putStrLn $ "there was an error in Julius evaluation: " ++ (show exc)
--                                       return emptyRTable
--   </pre>
eitherRunJulius :: Exception e => ETLMappingExpr -> IO (Either e RTable)

-- | Receives an input Julius expression, evaluates it to an ETL Mapping
--   (<a>ETLMapping</a>) and executes it, in order to return an
--   <a>RTabResult</a> containing an <a>RTable</a> storing the result of
--   the ETL Mapping, as well as the number of <a>RTuple</a>s returned.
juliusToResult :: ETLMappingExpr -> RTabResult

-- | Evaluate a Julius expression within the IO Monad and return an
--   <a>RTabResult</a>.
runJuliusToResult :: ETLMappingExpr -> IO RTabResult

-- | Evaluate a Julius expression within the IO Monad and return either an
--   <a>RTabResult</a>, or an exception, in case of an error during
--   evaluation.
eitherRunJuliusToResult :: Exception e => ETLMappingExpr -> IO (Either e RTabResult)

-- | Generic ETL execution function. It receives a list of input (aka
--   "source") <a>RTable</a>s and an ETL function that produces a list of
--   output (aka "target") <a>RTable</a>s. The ETL function should embed
--   all the "transformation-logic" from the source <a>RTable</a>s to the
--   target <a>RTable</a>s.
runETL :: ([RTable] -> [RTable]) -> [RTable] -> IO [RTable]

-- | Generic ETL execution function that returns either the target list of
--   <a>RTable</a>s, or an exception in case of a problem during the ETL
--   code execution. It receives a list of input (aka "source")
--   <a>RTable</a>s and an ETL function that produces a list of output (aka
--   "target") <a>RTable</a>s. The ETL function should embed all the
--   "transformation-logic" from the source <a>RTable</a>s to the target
--   <a>RTable</a>s.
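--   
--   A possible usage is sketched below, in the same spirit as the
--   <a>eitherRunJulius</a> example. It is only an illustration:
--   <tt>srcTab1</tt> and <tt>srcTab2</tt> are hypothetical source
--   <a>RTable</a>s assumed to be in scope, and <tt>etlLogic</tt> is a
--   made-up ETL function that combines them with <a>appendRTableJ</a>.
--   
--   <pre>
--   do 
--       -- hypothetical ETL function: two source RTables in, one target RTable out
--       let etlLogic [t1, t2] = [appendRTableJ t1 t2]
--           etlLogic ts       = ts   -- any other input is returned unchanged
--       res &lt;- (eitherRunETL etlLogic [srcTab1, srcTab2]) :: IO (Either SomeException [RTable])
--       case res of
--           Right targets -&gt; mapM_ printRTable targets
--           Left exc      -&gt; putStrLn $ "there was an error in the ETL execution: " ++ (show exc)
--   </pre>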
eitherRunETL :: Exception e => ([RTable] -> [RTable]) -> [RTable] -> IO (Either e [RTable])

-- | Returns a prefix of an ETLMappingExpr that matches a named
--   intermediate result. For example, below we show a Julius expression
--   where we define an intermediate named result called "myResult". This
--   result is used at a later stage in this Julius expression, by means
--   of the function takeNamedResult.
--   
--   <pre>
--   etlXpression = 
--               EtlMapStart
--               :-&gt; (EtlC $ ...)                
--               :=&gt; NamedResult "myResult" (EtlR $ ...) 
--               :-&gt; (EtlR $ ... )
--               :-&gt; (EtlR $
--                       ROpStart
--                       :.  (Minus 
--                               (TabL $ 
--                                   juliusToRTable $ takeNamedResult "myResult" etlXpression    --  THIS IS THE POINT WHERE WE USE THE NAMED RESULT!
--                               ) 
--                               (Previous))
--                   )
--   </pre>
--   
--   In the above Julius expression (etlXpression) the "myResult" named
--   result equals the prefix of etlXpression, up to and including the
--   operation with the named result "myResult".
--   
--   <pre>
--   
--   takeNamedResult "myResult" etlXpression ==  EtlMapStart
--                                               :-&gt; (EtlC $ ...)                
--                                               :=&gt; NamedResult "myResult" (EtlR $ ...) 
--   </pre>
--   
--   Note that the Julius expression is scanned from right to left and thus
--   it will return the longest prefix expression that matches the input
--   name.
takeNamedResult :: NamedResultName -> ETLMappingExpr -> ETLMappingExpr

-- | Returns an <a>UnaryRTableOperation</a> (<a>RTable</a> -&gt;
--   <a>RTable</a>) that adds a surrogate key (SK) column to an
--   <a>RTable</a> and fills each row with a SK value. It is primarily
--   intended to be used within a Julius expression. For example:
--   
--   <pre>
--   GenUnaryOp (On $ Tab rtab1) $ ByUnaryOp (addSurrogateKeyJ "TxSK" 0)
--   </pre>
addSurrogateKeyJ :: Integral a => ColumnName -> a -> RTable -> RTable

-- | Returns a <a>BinaryRTableOperation</a> (<a>RTable</a> -&gt;
--   <a>RTable</a> -&gt; <a>RTable</a>) that appends an <a>RTable</a> to a
--   target <a>RTable</a>. It is primarily intended to be used within a
--   Julius expression. For example:
--   
--   <pre>
--   GenBinaryOp (TabL rtab1) (Tab $ rtab2) $ ByBinaryOp appendRTableJ
--   </pre>
appendRTableJ :: RTable -> RTable -> RTable

-- | Returns an <a>ETLOperation</a> that adds a surrogate key (SK) column
--   to an <a>RTable</a> and fills each row with a SK value. This function
--   is only exposed for backward compatibility reasons. The recommended
--   function to use instead is <a>addSurrogateKeyJ</a>, which can be
--   embedded directly into a Julius expression as a
--   <a>UnaryRTableOperation</a>.
addSurrogateKey :: Integral a => ColumnName -> a -> ETLOperation

-- | Returns an <a>ETLOperation</a> that appends an <a>RTable</a> to a
--   target <a>RTable</a>. This function is only exposed for backward
--   compatibility reasons. The recommended function to use instead is
--   <a>appendRTableJ</a>, which can be embedded directly into a Julius
--   expression as a <a>BinaryRTableOperation</a>.
appendRTable :: ETLOperation


-- | This module implements the <a>RTabular</a> instance of the <a>CSV</a>
--   data type, i.e., implements the interface by which a CSV file can be
--   transformed to/from an <a>RTable</a>. It is required when we want to
--   do ETL/ELT over CSV files with the <a>DBFunctor</a> package (i.e.,
--   with the <b>Julius</b> EDSL for ETL/ELT found in the <a>Etl.Julius</a>
--   module).
--   
--   The minimum requirement for implementing an <a>RTabular</a> instance
--   for a data type is to implement the <a>toRTable</a> and
--   <a>fromRTable</a> functions. Apart from these two functions, this
--   module also exports functions for reading and writing <a>CSV</a> data
--   from/to CSV files. It also supports all types of delimiters (not only
--   commas) and CSVs with or without headers (see <a>CSVOptions</a>).
--   
--   For the <a>CSV</a> data type this module uses the Cassava library
--   (<a>Data.Csv</a>).
module RTable.Data.CSV

-- | Definition of a CSV file. Treating CSV data as opaque byte strings
newtype CSV
CSV :: Vector Row -> CSV
[csv] :: CSV -> Vector Row

-- | Definition of a CSV Row. Essentially a Row is just a Vector of
--   ByteString
type Row = Vector Column

-- | Definition of a CSV column.
type Column = Field

-- | Options for a CSV file (e.g., delimiter specification, header
--   specification etc.)
data CSVOptions
CSVOptions :: Char -> YesNo -> CSVOptions
[delimiter] :: CSVOptions -> Char
[hasHeader] :: CSVOptions -> YesNo

-- | Yes or No sum type
data YesNo
Yes :: YesNo
No :: YesNo

-- | reads a CSV file and returns a <a>CSV</a> data type (Treating CSV data
--   as opaque byte strings)
readCSV :: FilePath -> IO CSV

-- | reads a CSV file based on input options (delimiter and header option)
--   and returns a <a>CSV</a> data type (Treating CSV data as opaque byte
--   strings)
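--   
--   A minimal sketch of reading a non-comma-delimited file follows. The
--   file name <tt>data.csv</tt>, its semicolon delimiter and its header
--   row are assumptions made only for illustration.
--   
--   <pre>
--   import RTable.Data.CSV (CSVOptions (..), YesNo (..), printCSV, readCSVwithOptions)
--   
--   main :: IO ()
--   main = do
--       -- hypothetical input: a semicolon-delimited file with a header row
--       let opts = CSVOptions { delimiter = ';', hasHeader = Yes }
--       csvData &lt;- readCSVwithOptions opts "data.csv"
--       printCSV csvData
--   </pre>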
readCSVwithOptions :: CSVOptions -> FilePath -> IO CSV

-- | reads a CSV file and returns a lazy bytestring
readCSVFile :: FilePath -> IO ByteString

-- | write a <a>CSV</a> to a newly created csv file
writeCSV :: FilePath -> CSV -> IO ()

-- | write a CSV (bytestring) to a newly created csv file
writeCSVFile :: FilePath -> ByteString -> IO ()
toRTable :: RTabular a => RTableMData -> a -> RTable
fromRTable :: RTabular a => RTableMData -> RTable -> a

-- | print input <a>CSV</a> on screen
printCSV :: CSV -> IO ()

-- | print input CSV on screen
printCSVFile :: ByteString -> IO ()

-- | copy input csv file to specified output csv file
copyCSV :: FilePath -> FilePath -> IO ()

-- | selectNrows: Returns the first N rows from a CSV file
selectNrows :: Int -> CSV -> CSV

-- | Column projection on an input CSV file where desired columns are
--   defined by position (index) in the CSV.
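--   
--   For example, a projection could be combined with <a>readCSV</a> and
--   <a>writeCSV</a> as sketched below. The file names are hypothetical and
--   the sketch assumes that positions are counted from 0, which should be
--   verified against the actual data.
--   
--   <pre>
--   import RTable.Data.CSV (projectByIndex, readCSV, writeCSV)
--   
--   main :: IO ()
--   main = do
--       csvData &lt;- readCSV "input.csv"   -- hypothetical input file
--       -- keep only the columns at positions 0 and 2 (assumed to be 0-based)
--       writeCSV "output.csv" $ projectByIndex [0, 2] csvData
--   </pre>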
projectByIndex :: [Int] -> CSV -> CSV

-- | O(1) First row
headCSV :: CSV -> Row

-- | O(1) Yield all but the first row without copying. The CSV may not be
--   empty.
tailCSV :: CSV -> CSV

-- | creates a <a>Header</a> (as defined in <a>Data.Csv</a>) from an
--   <a>RTable</a>
csvHeaderFromRtable :: RTable -> Header

-- | Exception to signify an error in decoding a CSV file into a <a>CSV</a>
--   data type
data CsvFileDecodingError
CsvFileDecodingError :: FilePath -> Text -> CsvFileDecodingError

-- | This exception signifies an error in parsing a <a>CSV</a>
--   <a>Column</a> to an <a>RDataType</a> value
data CSVColumnToRDataTypeError
CSVColumnToRDataTypeError :: ColumnName -> Text -> CSVColumnToRDataTypeError
instance GHC.Show.Show RTable.Data.CSV.CSVColumnToRDataTypeError
instance GHC.Classes.Eq RTable.Data.CSV.CSVColumnToRDataTypeError
instance GHC.Show.Show RTable.Data.CSV.CsvFileDecodingError
instance GHC.Classes.Eq RTable.Data.CSV.CsvFileDecodingError
instance GHC.Exception.Type.Exception RTable.Data.CSV.CSVColumnToRDataTypeError
instance GHC.Exception.Type.Exception RTable.Data.CSV.CsvFileDecodingError
instance RTable.Core.RTabular RTable.Data.CSV.CSV