Copyright	(c) Nikos Karagiannidis 2018
License	BSD3
Maintainer	nkarag@gmail.com
Stability	stable
Portability	POSIX
Safe Haskell	None
Language	Haskell2010

RTable.Core

Contents

The Relational Table Concept
- RTable Data Types
- RTable Metadata Data Types
Type Classes for "Tabular Data"
Relational Algebra Operations
RTable Functions
Modify RTable (DML)
Create/Alter RTable (DDL)
Metadata Functions
Exceptions
RTable IO Operations
- RTable Printing and Formatting

Description

This is the core module that implements the relational Table concept with the RTable data type. It defines all necessary data types like RTable and RTuple as well as all the basic relational algebra operations (selection -i.e., filter- , projection, inner/outer join, aggregation, grouping etc.) on RTables.

When to use this module

This module should be used whenever one has "tabular data" (e.g., some CSV files, or any type of data that can be an instance of the RTabular type class and thus define the toRTable and fromRTable functions) and wants to analyze them in-memory with the well-known relational algebra operations (selection, projection, join, groupby, aggregations etc) that lie behind SQL. This data analysis takes place within your haskell code, without the need to import the data into a database (database-less data processing) and the result can be turned into the original format (e.g., CSV) with a simple call to the fromRTable function.

RTable.Core gives you an interface for all common relational algebra operations, which are expressed as functions over the basic RTable data type. Of course, since each relational algebra operation is a function that returns a new RTable (immutability), one can compose these operations and thus express an arbitrary complex query. Immutability also holds for DML operations also (e.g., updateRTab). This means that any update on an RTable operates like a CREATE AS SELECT statement in SQL, creating a new RTable and not modifying an existing one.

Note that the recommended method in order to perform data analysis via relational algebra operations is to use the type-level Embedded Domain Specific Language (EDSL) Julius, defined in module Etl.Julius, which exports the RTable.Core module. This provides a standard way of expressing queries and is simpler for expressing more complex queries (with many relational algebra operations). Moreover it supports intermediate results (i.e., subqueries). Finally, if you need to implement some ETL/ELT data flows, that will use the relational operations defined in RTable.Core to analyze data but also to combine them with various Column Mappings (RColMapping), in order to achieve various data transformations, then Julius is the appropriate tool for this job.

See this Julius Tutorial

Overview

An RTable is logically a container of RTuples (similar to the concept of a Relation being a set of Tuples) and is the core data type in this module. The RTuple is a map of (Column-Name, Column-Value) pairs. A Column-Name is modeled with the ColumnName data type, while the Column-Value is modelled with the RDataType, which is a wrapper over the most common data types that one would expect to find in a column of a Table (e.g., integers, rational numbers, strings, dates etc.).

We said that the RTable is a container of RTuples and thus the RTable is a Monad! So one can write monadic code to implement RTable operations. For example:

   -- | Return an new RTable after modifying each RTuple of the input RTable.
   myRTableOperation :: RTable -> RTable
   myRTableOperation rtab = do
           rtup <- rtab
           let new_rtup = doStuff rtup
           return new_rtup
       where
           doStuff :: RTuple -> RTuple
           doStuff = ...  -- to be defined

Many different types of data can be turned into an RTable. For example, CSV data can be easily turn into an RTable via the toRTable function. Many other types of data could be represented as "tabular data" via the RTable data type, as long as they adhere to the interface posed by the RTabular type class. In other words, any data type that we want to convert into an RTable and vice-versa, must become an instance of the RTabular type class and thus define the basic toRTable and fromRTable functions.

An Example

In this example we read a CSV file with the use of the readCSV function from the RTable.Data.CSV module. Then, with the use of the toRTable function, implemented in the RTabular instance of the CSV data type, we convert the CSV file into an RTable. The data of the CSV file consist of metadata from an imaginary Oracle database and each row represents an entry for a table stored in this database, with information (i.e., columns) pertaining to the owner of the table, the tablespace name, the status of the table and various statistics, such as the number of rows and number of blocks.

In this example, we apply three "transformations" to the input data and we print the result after each one, with the use of the printfRTable function. The transfomrations are:

a limit operation, where we return the first N number of RTuples,
an RFilter operation that returns only the tables that start with a 'B', followed by a projection operation (RPrj)
an inner-join (RInJoin), where we pair the RTuples from the previous results based on a join predicate (RJoinPredicate): the tables that have been analyzed the same day

Finally, we store the results of the 2nd operation into a new CSV file, with the use of the fromRTable function implemented for the RTabular instance of the CSV data type.


import  RTable.Core
import  RTable.Data.CSV     (CSV, readCSV, toRTable)
import  Data.Text as T          (take, pack)

-- This is the input source table metadata
src_DBTab_MData :: RTableMData
src_DBTab_MData = 
    createRTableMData   (   "sourceTab"  -- table name
                            ,[  ("OWNER", Varchar)                                      -- Owner of the table
                                ,("TABLE_NAME", Varchar)                                -- Name of the table
                                ,("TABLESPACE_NAME", Varchar)                           -- Tablespace name
                                ,("STATUS",Varchar)                                     -- Status of the table object (VALID/IVALID)
                                ,("NUM_ROWS", Integer)                                  -- Number of rows in the table
                                ,("BLOCKS", Integer)                                    -- Number of Blocks allocated for this table
                                ,("LAST_ANALYZED", Timestamp "MMDDYYYY HH24:MI:SS")   -- Timestamp of the last time the table was analyzed (i.e., gathered statistics) 
                            ]
                        )
                        ["OWNER", "TABLE_NAME"] -- primary key
                        [] -- (alternative) unique keys


-- Result RTable metadata
result_tab_MData :: RTableMData
result_tab_MData = 
    createRTableMData   (   "resultTab"  -- table name
                            ,[  ("OWNER", Varchar)                                        -- Owner of the table
                                ,("TABLE_NAME", Varchar)                                  -- Name of the table
                                ,("LAST_ANALYZED", Timestamp "MMDDYYYY HH24:MI:SS")   -- Timestamp of the last time the table was analyzed (i.e., gathered statistics) 
                            ]
                        )
                        ["OWNER", "TABLE_NAME"] -- primary key
                        [] -- (alternative) unique keys


main :: IO()
main = do
     -- read source csv file
    srcCSV <- readCSV "./app/test-data.csv"

    putStrLn "\nHow many rows you want to print from the source table? :\n"
    n <- readLn :: IO Int    
    
    -- RTable A
    printfRTable (  -- define the order by which the columns will appear on screen. Use the default column formatting.
                    genRTupleFormat ["OWNER", "TABLE_NAME", "TABLESPACE_NAME", "STATUS", "NUM_ROWS", "BLOCKS", "LAST_ANALYZED"] genDefaultColFormatMap) $ 
                        limit n $ toRTable src_DBTab_MData srcCSV 

    putStrLn "\nThese are the tables that start with a "B":\n"    
    
    -- RTable B
    printfRTable ( genRTupleFormat ["OWNER", "TABLE_NAME","LAST_ANALYZED"] genDefaultColFormatMap) $ 
        tabs_start_with_B $ toRTable src_DBTab_MData srcCSV 
    
    putStrLn "\nThese are the tables that were analyzed the same day:\n"    
    
    -- RTable C = A InnerJoin B
    printfRTable ( genRTupleFormat ["OWNER", "TABLE_NAME", "LAST_ANALYZED", "OWNER_1", "TABLE_NAME_1", "LAST_ANALYZED_1"] genDefaultColFormatMap) $ 
        ropB  myJoin
                    (limit n $ toRTable src_DBTab_MData srcCSV) 
                    (tabs_start_with_B $ toRTable src_DBTab_MData srcCSV)

    -- save result of 2nd operation to CSV file
    writeCSV ".appresult-data.csv" $ 
                    fromRTable result_tab_MData $ 
                        tabs_start_with_B $ 
                            toRTable src_DBTab_MData srcCSV 

    where
        -- Return RTuples with a table_name starting with a B
        tabs_start_with_B :: RTable -> RTable
        tabs_start_with_B rtab = (ropU myProjection) . (ropU myFilter) $ rtab
            where
                -- Create a Filter Operation to return only RTuples with table_name starting with a B
                myFilter = RFilter (    t ->   let 
                                                    tbname = case toText (t <!> "TABLE_NAME") of
                                                                Just t -> t
                                                                Nothing -> pack ""
                                                in (T.take 1 tbname) == (pack "B")
                                    )
                -- Create a Projection Operation that projects only two columns
                myProjection = RPrj ["OWNER", "TABLE_NAME", "LAST_ANALYZED"]

        -- Create an Inner Join for tables analyzed in the same day
        myJoin :: ROperation
        myJoin = RInJoin (  t1 t2 -> 
                                let
                                    RTime {rtime = RTimestampVal {year = y1, month = m1, day = d1, hours24 = hh1, minutes = mm1, seconds = ss1}} = t1<!>"LAST_ANALYZED"
                                    RTime {rtime = RTimestampVal {year = y2, month = m2, day = d2, hours24 = hh2, minutes = mm2, seconds = ss2}} = t2<!>"LAST_ANALYZED"
                                in y1 == y2 && m1 == m2 && d1 == d2
                        )

And here is the output:

:l .srcRTable/example.hs
:set -XOverloadedStrings
main

How many rows you want to print from the source table? :

10
---------------------------------------------------------------------------------------------------------------------------------
OWNER           TABLE_NAME                        TABLESPACE_NAME     STATUS     NUM_ROWS     BLOCKS     LAST_ANALYZED
~~~~~           ~~~~~~~~~~                        ~~~~~~~~~~~~~~~     ~~~~~~     ~~~~~~~~     ~~~~~~     ~~~~~~~~~~~~~
APEX_030200     SYS_IOT_OVER_71833                SYSAUX              VALID      0            0          06082012 16:22:36
APEX_030200     WWV_COLUMN_EXCEPTIONS             SYSAUX              VALID      3            3          06082012 16:22:33
APEX_030200     WWV_FLOWS                         SYSAUX              VALID      10           3          06082012 22:01:21
APEX_030200     WWV_FLOWS_RESERVED                SYSAUX              VALID      0            0          06082012 16:22:33
APEX_030200     WWV_FLOW_ACTIVITY_LOG1$           SYSAUX              VALID      1            29         07202012 19:07:57
APEX_030200     WWV_FLOW_ACTIVITY_LOG2$           SYSAUX              VALID      14           29         07202012 19:07:57
APEX_030200     WWV_FLOW_ACTIVITY_LOG_NUMBER$     SYSAUX              VALID      1            3          07202012 19:08:00
APEX_030200     WWV_FLOW_ALTERNATE_CONFIG         SYSAUX              VALID      0            0          06082012 16:22:33
APEX_030200     WWV_FLOW_ALT_CONFIG_DETAIL        SYSAUX              VALID      0            0          06082012 16:22:33
APEX_030200     WWV_FLOW_ALT_CONFIG_PICK          SYSAUX              VALID      37           3          06082012 16:22:33


10 rows returned
---------------------------------------------------------------------------------------------------------------------------------

These are the tables that start with a B:

-------------------------------------------------------------
OWNER      TABLE_NAME                LAST_ANALYZED
~~~~~      ~~~~~~~~~~                ~~~~~~~~~~~~~
DBSNMP     BSLN_BASELINES            04152018 16:14:51
DBSNMP     BSLN_METRIC_DEFAULTS      06082012 16:06:41
DBSNMP     BSLN_STATISTICS           04152018 17:41:33
DBSNMP     BSLN_THRESHOLD_PARAMS     06082012 16:06:41
SYS        BOOTSTRAP$                04142014 13:53:43


5 rows returned
-------------------------------------------------------------

These are the tables that were analyzed the same day:

-------------------------------------------------------------------------------------------------------------------------------------
OWNER           TABLE_NAME                     LAST_ANALYZED           OWNER_1     TABLE_NAME_1              LAST_ANALYZED_1
~~~~~           ~~~~~~~~~~                     ~~~~~~~~~~~~~           ~~~~~~~     ~~~~~~~~~~~~              ~~~~~~~~~~~~~~~
APEX_030200     SYS_IOT_OVER_71833             06082012 16:22:36     DBSNMP      BSLN_THRESHOLD_PARAMS     06082012 16:06:41
APEX_030200     SYS_IOT_OVER_71833             06082012 16:22:36     DBSNMP      BSLN_METRIC_DEFAULTS      06082012 16:06:41
APEX_030200     WWV_COLUMN_EXCEPTIONS          06082012 16:22:33     DBSNMP      BSLN_THRESHOLD_PARAMS     06082012 16:06:41
APEX_030200     WWV_COLUMN_EXCEPTIONS          06082012 16:22:33     DBSNMP      BSLN_METRIC_DEFAULTS      06082012 16:06:41
APEX_030200     WWV_FLOWS                      06082012 22:01:21     DBSNMP      BSLN_THRESHOLD_PARAMS     06082012 16:06:41
APEX_030200     WWV_FLOWS                      06082012 22:01:21     DBSNMP      BSLN_METRIC_DEFAULTS      06082012 16:06:41
APEX_030200     WWV_FLOWS_RESERVED             06082012 16:22:33     DBSNMP      BSLN_THRESHOLD_PARAMS     06082012 16:06:41
APEX_030200     WWV_FLOWS_RESERVED             06082012 16:22:33     DBSNMP      BSLN_METRIC_DEFAULTS      06082012 16:06:41
APEX_030200     WWV_FLOW_ALTERNATE_CONFIG      06082012 16:22:33     DBSNMP      BSLN_THRESHOLD_PARAMS     06082012 16:06:41
APEX_030200     WWV_FLOW_ALTERNATE_CONFIG      06082012 16:22:33     DBSNMP      BSLN_METRIC_DEFAULTS      06082012 16:06:41
APEX_030200     WWV_FLOW_ALT_CONFIG_DETAIL     06082012 16:22:33     DBSNMP      BSLN_THRESHOLD_PARAMS     06082012 16:06:41
APEX_030200     WWV_FLOW_ALT_CONFIG_DETAIL     06082012 16:22:33     DBSNMP      BSLN_METRIC_DEFAULTS      06082012 16:06:41
APEX_030200     WWV_FLOW_ALT_CONFIG_PICK       06082012 16:22:33     DBSNMP      BSLN_THRESHOLD_PARAMS     06082012 16:06:41
APEX_030200     WWV_FLOW_ALT_CONFIG_PICK       06082012 16:22:33     DBSNMP      BSLN_METRIC_DEFAULTS      06082012 16:06:41


14 rows returned
-------------------------------------------------------------------------------------------------------------------------------------

Check the output CSV file

$ head ./app/result-data.csv
OWNER,TABLE_NAME,LAST_ANALYZED
DBSNMP,BSLN_BASELINES,04152018 16:14:51
DBSNMP,BSLN_METRIC_DEFAULTS,06082012 16:06:41
DBSNMP,BSLN_STATISTICS,04152018 17:41:33
DBSNMP,BSLN_THRESHOLD_PARAMS,06082012 16:06:41
SYS,BOOTSTRAP$,04142014 13:53:43

Synopsis

type RTable = Vector RTuple
type RTuple = HashMap ColumnName RDataType
data RDataType
- = RInt {
  - rint :: !Integer
  }
- | RText {
  - rtext :: !Text
  }
- | RDate {
  - rdate :: !Text
  - dtformat :: !Text
  }
- | RTime {
  - rtime :: !RTimestamp
  }
- | RDouble {
  - rdouble :: !Double
  }
- | Null
data RTimestamp = RTimestampVal {
- year :: !Int
- month :: !Int
- day :: !Int
- hours24 :: !Int
- minutes :: !Int
- seconds :: !Int
}
data RTableMData = RTableMData {
- rtname :: RTableName
- rtuplemdata :: RTupleMData
- pkColumns :: [ColumnName]
- uniqueKeys :: [[ColumnName]]
}
type RTupleMData = (HashMap ColumnOrder ColumnName, HashMap ColumnName ColumnInfo)
data ColumnInfo = ColumnInfo {
- name :: ColumnName
- dtype :: ColumnDType
}
type ColumnOrder = Int
type Name = Text
type ColumnName = Name
type RTableName = Name
data ColumnDType
- = UknownType
- | Integer
- | Varchar
- | Date Text
- | Timestamp Text
- | Double
type Delimiter = String
class RTabular a where
- toRTable :: RTableMData -> a -> RTable
- fromRTable :: RTableMData -> RTable -> a
data ROperation
- = ROperationEmpty
- | RUnion
- | RInter
- | RDiff
- | RPrj {
  - colPrjList :: [ColumnName]
  }
- | RFilter {
  - fpred :: RPredicate
  }
- | RInJoin {
  - jpred :: RJoinPredicate
  }
- | RLeftJoin {
  - jpred :: RJoinPredicate
  }
- | RRightJoin {
  - jpred :: RJoinPredicate
  }
- | RSemiJoin {
  - jpred :: RJoinPredicate
  }
- | RAntiJoin {
  - jpred :: RJoinPredicate
  }
- | RAggregate {
  - aggList :: [RAggOperation]
  }
- | RGroupBy {
  - gpred :: RGroupPredicate
  - aggList :: [RAggOperation]
  - colGrByList :: [ColumnName]
  }
- | RCombinedOp {
  - rcombOp :: UnaryRTableOperation
  }
- | RBinOp {
  - rbinOp :: BinaryRTableOperation
  }
- | ROrderBy {
  - colOrdList :: [(ColumnName, OrderingSpec)]
  }
type UnaryRTableOperation = RTable -> RTable
type BinaryRTableOperation = RTable -> RTable -> RTable
data RAggOperation = RAggOperation {
- sourceCol :: ColumnName
- targetCol :: ColumnName
- aggFunc :: RTable -> RTuple
}
type AggFunction = ColumnName -> RTable -> RDataType
raggGenericAgg :: AggFunction -> ColumnName -> ColumnName -> RAggOperation
raggSum :: ColumnName -> ColumnName -> RAggOperation
raggCount :: ColumnName -> ColumnName -> RAggOperation
raggCountDist :: ColumnName -> ColumnName -> RAggOperation
raggCountStar :: ColumnName -> RAggOperation
raggAvg :: ColumnName -> ColumnName -> RAggOperation
raggMax :: ColumnName -> ColumnName -> RAggOperation
raggMin :: ColumnName -> ColumnName -> RAggOperation
raggStrAgg :: ColumnName -> ColumnName -> Delimiter -> RAggOperation
type RPredicate = RTuple -> Bool
type RGroupPredicate = RTuple -> RTuple -> Bool
type RJoinPredicate = RTuple -> RTuple -> Bool
data RUpsertPredicate = RUpsertPredicate {
- matchCols :: [ColumnName]
- matchPred :: RTuple -> RTuple -> Bool
}
runUnaryROperation :: ROperation -> RTable -> RTable
ropU :: ROperation -> RTable -> RTable
runUnaryROperationRes :: ROperation -> RTable -> RTabResult
ropUres :: ROperation -> RTable -> RTabResult
runBinaryROperation :: ROperation -> RTable -> RTable -> RTable
ropB :: ROperation -> RTable -> RTable -> RTable
runBinaryROperationRes :: ROperation -> RTable -> RTable -> RTabResult
ropBres :: ROperation -> RTable -> RTable -> RTabResult
type RTuplesRet = Sum Int
type RTabResult = Writer RTuplesRet RTable
rtabResult :: (RTable, RTuplesRet) -> RTabResult
runRTabResult :: RTabResult -> (RTable, RTuplesRet)
execRTabResult :: RTabResult -> RTuplesRet
rtuplesRet :: Int -> RTuplesRet
getRTuplesRet :: RTuplesRet -> Int
(.) :: (b -> c) -> (a -> b) -> a -> c
(<=<) :: Monad m => (b -> m c) -> (a -> m b) -> a -> m c
runRfilter :: RPredicate -> RTable -> RTable
f :: RPredicate -> RTable -> RTable
runInnerJoinO :: RJoinPredicate -> RTable -> RTable -> RTable
iJ :: RJoinPredicate -> RTable -> RTable -> RTable
runLeftJoin :: RJoinPredicate -> RTable -> RTable -> RTable
lJ :: RJoinPredicate -> RTable -> RTable -> RTable
runRightJoin :: RJoinPredicate -> RTable -> RTable -> RTable
rJ :: RJoinPredicate -> RTable -> RTable -> RTable
runFullOuterJoin :: RJoinPredicate -> RTable -> RTable -> RTable
foJ :: RJoinPredicate -> RTable -> RTable -> RTable
sJ :: RJoinPredicate -> RTable -> RTable -> RTable
runSemiJoin :: RJoinPredicate -> RTable -> RTable -> RTable
aJ :: RJoinPredicate -> RTable -> RTable -> RTable
runAntiJoin :: RJoinPredicate -> RTable -> RTable -> RTable
joinRTuples :: RTuple -> RTuple -> RTuple
runUnion :: RTable -> RTable -> RTable
runUnionAll :: RTable -> RTable -> RTable
u :: RTable -> RTable -> RTable
runIntersect :: RTable -> RTable -> RTable
i :: RTable -> RTable -> RTable
runDiff :: RTable -> RTable -> RTable
d :: RTable -> RTable -> RTable
runProjection :: [ColumnName] -> RTable -> RTable
runProjectionMissedHits :: [ColumnName] -> RTable -> RTable
p :: [ColumnName] -> RTable -> RTable
runAggregation :: [RAggOperation] -> RTable -> RTable
rAgg :: [RAggOperation] -> RTable -> RTable
runGroupBy :: RGroupPredicate -> [RAggOperation] -> [ColumnName] -> RTable -> RTable
rG :: RGroupPredicate -> [RAggOperation] -> [ColumnName] -> RTable -> RTable
groupNoAggList :: RGroupPredicate -> [ColumnName] -> RTable -> [RTable]
groupNoAgg :: RGroupPredicate -> [ColumnName] -> RTable -> RTable
runOrderBy :: [(ColumnName, OrderingSpec)] -> RTable -> RTable
rO :: [(ColumnName, OrderingSpec)] -> RTable -> RTable
runCombinedROp :: (RTable -> RTable) -> RTable -> RTable
rComb :: (RTable -> RTable) -> RTable -> RTable
data IgnoreDefault
- = Ignore
- | NotIgnore
decodeRTable :: ColumnName -> RDataType -> RDataType -> RDataType -> IgnoreDefault -> RTable -> RTable
decodeColValue :: ColumnName -> RDataType -> RDataType -> RDataType -> IgnoreDefault -> RTuple -> RDataType
toRTimestamp :: String -> String -> RTimestamp
createRTimestamp :: String -> String -> RTimestamp
toUTCTime :: RTimestamp -> UTCTime
fromUTCTime :: UTCTime -> RTimestamp
rTimestampToRText :: String -> RTimestamp -> RDataType
stdTimestampFormat :: String
stdDateFormat :: [Char]
instrRText :: RDataType -> RDataType -> Maybe Int
instr :: Eq a => [a] -> [a] -> Maybe Int
instrText :: Text -> Text -> Maybe Int
rdtappend :: RDataType -> RDataType -> RDataType
stripRText :: RDataType -> RDataType
removeCharAroundRText :: Char -> RDataType -> RDataType
isText :: RDataType -> Bool
nvlRTable :: ColumnName -> RDataType -> RTable -> RTable
nvlRTuple :: ColumnName -> RDataType -> RTuple -> RTuple
isNullRTuple :: RTuple -> Bool
isNull :: RDataType -> Bool
isNotNull :: RDataType -> Bool
nvl :: RDataType -> RDataType -> RDataType
nvlColValue :: ColumnName -> RDataType -> RTuple -> RDataType
isRTabEmpty :: RTable -> Bool
headRTup :: RTable -> RTuple
limit :: Int -> RTable -> RTable
isRTupEmpty :: RTuple -> Bool
getRTupColValue :: ColumnName -> RTuple -> RDataType
rtupLookup :: ColumnName -> RTuple -> Maybe RDataType
rtupLookupDefault :: RDataType -> ColumnName -> RTuple -> RDataType
(<!>) :: RTuple -> ColumnName -> RDataType
(<!!>) :: RTuple -> ColumnName -> Maybe RDataType
rtableToList :: RTable -> [RTuple]
concatRTab :: [RTable] -> RTable
rtupleToList :: RTuple -> [(ColumnName, RDataType)]
toListRDataType :: RTupleMData -> RTuple -> [RDataType]
toText :: RDataType -> Maybe Text
fromText :: Text -> RDataType
rtabMap :: (RTuple -> RTuple) -> RTable -> RTable
rtabFoldr' :: (RTuple -> RTable -> RTable) -> RTable -> RTable -> RTable
rtabFoldl' :: (RTable -> RTuple -> RTable) -> RTable -> RTable -> RTable
rtupleMap :: (RDataType -> RDataType) -> RTuple -> RTuple
rtupleMapWithKey :: (ColumnName -> RDataType -> RDataType) -> RTuple -> RTuple
rdatatypeFoldr' :: (RTuple -> RDataType -> RDataType) -> RDataType -> RTable -> RDataType
rdatatypeFoldl' :: (RDataType -> RTuple -> RDataType) -> RDataType -> RTable -> RDataType
insertAppendRTab :: RTuple -> RTable -> RTable
insertPrependRTab :: RTuple -> RTable -> RTable
insertRTabToRTab :: RTable -> RTable -> RTable
deleteRTab :: RPredicate -> RTable -> RTable
updateRTab :: [(ColumnName, RDataType)] -> RPredicate -> RTable -> RTable
upsertRTab :: RTable -> RUpsertPredicate -> [ColumnName] -> RPredicate -> RTable -> RTable
updateRTuple :: ColumnName -> RDataType -> RTuple -> RTuple
upsertRTuple :: ColumnName -> RDataType -> RTuple -> RTuple
emptyRTable :: RTable
createSingletonRTable :: RTuple -> RTable
rtableFromList :: [RTuple] -> RTable
addColumn :: ColumnName -> RDataType -> RTable -> RTable
removeColumn :: ColumnName -> RTable -> RTable
emptyRTuple :: RTuple
createNullRTuple :: [ColumnName] -> RTuple
createRTuple :: [(ColumnName, RDataType)] -> RTuple
rtupleFromList :: [(ColumnName, RDataType)] -> RTuple
createRDataType :: Typeable a => a -> RDataType
createRTableMData :: (RTableName, [(ColumnName, ColumnDType)]) -> [ColumnName] -> [[ColumnName]] -> RTableMData
getColumnNamesFromRTab :: RTable -> [ColumnName]
getColumnNamesFromRTuple :: RTuple -> [ColumnName]
getColumnInfoFromRTab :: RTable -> [ColumnInfo]
getColumnInfoFromRTuple :: RTuple -> [ColumnInfo]
getTheType :: RDataType -> ColumnDType
listOfColInfoRDataType :: [ColumnInfo] -> RTuple -> [(ColumnInfo, RDataType)]
toListColumnName :: RTupleMData -> [ColumnName]
toListColumnInfo :: RTupleMData -> [ColumnInfo]
rtabsSameStructure :: RTable -> RTable -> Bool
rtuplesSameStructure :: RTuple -> RTuple -> Bool
getUniqueColumnNamesAfterJoin :: [ColumnName] -> [ColumnName] -> [ColumnName]
data ColumnDoesNotExist = ColumnDoesNotExist ColumnName
data ConflictingRTableStructures = ConflictingRTableStructures String
data EmptyInputStringsInToRTimestamp = EmptyInputStringsInToRTimestamp String String
data UnsupportedTimeStampFormat = UnsupportedTimeStampFormat String
data UniquenessViolationInUpsert = UniquenessViolationInUpsert String
printRTable :: RTable -> IO ()
eitherPrintRTable :: Exception e => (RTable -> IO ()) -> RTable -> IO (Either e ())
printfRTable :: RTupleFormat -> RTable -> IO ()
eitherPrintfRTable :: Exception e => (RTupleFormat -> RTable -> IO ()) -> RTupleFormat -> RTable -> IO (Either e ())
data RTupleFormat = RTupleFormat {
- colSelectList :: [ColumnName]
- colFormatMap :: ColFormatMap
}
type ColFormatMap = HashMap ColumnName FormatSpecifier
data FormatSpecifier
- = DefaultFormat
- | Format String
data OrderingSpec
- = Asc
- | Desc
genRTupleFormat :: [ColumnName] -> ColFormatMap -> RTupleFormat
genRTupleFormatDefault :: RTupleFormat
genColFormatMap :: [(ColumnName, FormatSpecifier)] -> ColFormatMap
genDefaultColFormatMap :: ColFormatMap

The Relational Table Concept

RTable Data Types

type RTable = Vector RTuple Source #

Definition of the Relational Table entity An RTable is a "container" of RTuples.

type RTuple = HashMap ColumnName RDataType Source #

Definition of the Relational Tuple. An RTuple is implemented as a HashMap of (ColumnName, RDataType) pairs. This ensures fast access of the column value by column name. Note that this implies that the RTuple CANNOT have more than one columns with the same name (i.e. hashmap key) and more importantly that it DOES NOT have a fixed order of columns, as it is usual in RDBMS implementations. This gives us the freedom to perform column change operations very fast. The only place were we need fixed column order is when we try to load an RTable from a fixed-column structure such as a CSV file. For this reason, we have embedded the notion of a fixed column-order in the RTuple metadata. See RTupleMData.

data RDataType Source #

Definition of the Relational Data Type. This is the data type of the values stored in each RTable. This is a strict data type, meaning whenever we evaluate a value of type RDataType, there must be also evaluated all the fields it contains.

Constructors

RInt
Fields rint :: !Integer
RText
Fields rtext :: !Text
RDate
Fields rdate :: !Text dtformat :: !Text e.g., "DD/MM/YYYY"
RTime
Fields rtime :: !RTimestamp
RDouble
Fields rdouble :: !Double
Null

Instances

Eq RDataType Source #	We need to explicitly specify equation of RDataType due to SQL NULL logic (i.e., anything compared to NULL returns false): `Null == _ = False, _ == Null = False, Null /= _ = False, _ /= Null = False.` IMPORTANT NOTE: Of course this means that anywhere in your code where you have something like this: `x == Null or x /= Null,` will always return False and thus it is futile to do this comparison. You have to use the is `isNull` function instead.
Instance details Defined in RTable.Core Methods (==) :: RDataType -> RDataType -> Bool # (/=) :: RDataType -> RDataType -> Bool #
Fractional RDataType Source #	In order to be able to use (/) with RDataType
Instance details Defined in RTable.Core Methods (/) :: RDataType -> RDataType -> RDataType # recip :: RDataType -> RDataType # fromRational :: Rational -> RDataType #
Num RDataType Source #
Instance details Defined in RTable.Core Methods (+) :: RDataType -> RDataType -> RDataType # (-) :: RDataType -> RDataType -> RDataType # (*) :: RDataType -> RDataType -> RDataType # negate :: RDataType -> RDataType # abs :: RDataType -> RDataType # signum :: RDataType -> RDataType # fromInteger :: Integer -> RDataType #
Ord RDataType Source #
Instance details Defined in RTable.Core Methods compare :: RDataType -> RDataType -> Ordering # (<) :: RDataType -> RDataType -> Bool # (<=) :: RDataType -> RDataType -> Bool # (>) :: RDataType -> RDataType -> Bool # (>=) :: RDataType -> RDataType -> Bool # max :: RDataType -> RDataType -> RDataType # min :: RDataType -> RDataType -> RDataType #
Read RDataType Source #
Instance details Defined in RTable.Core Methods readsPrec :: Int -> ReadS RDataType # readList :: ReadS [RDataType] # readPrec :: ReadPrec RDataType # readListPrec :: ReadPrec [RDataType] #
Show RDataType Source #
Instance details Defined in RTable.Core Methods showsPrec :: Int -> RDataType -> ShowS # show :: RDataType -> String # showList :: [RDataType] -> ShowS #
Generic RDataType Source #
Instance details Defined in RTable.Core Associated Types type Rep RDataType :: Type -> Type # Methods from :: RDataType -> Rep RDataType x # to :: Rep RDataType x -> RDataType #
NFData RDataType Source #	In order to be able to force full evaluation up to Normal Form (NF) https://www.fpcomplete.com/blog/2017/09/all-about-strictness
Instance details Defined in RTable.Core Methods rnf :: RDataType -> () #
type Rep RDataType Source #
Instance details Defined in RTable.Core type Rep RDataType = D1 (MetaData "RDataType" "RTable.Core" "DBFunctor-0.1.1.0-7Lx0zfYxtD0CRZBriX9SCp" False) ((C1 (MetaCons "RInt" PrefixI True) (S1 (MetaSel (Just "rint") NoSourceUnpackedness SourceStrict DecidedStrict) (Rec0 Integer)) :+: (C1 (MetaCons "RText" PrefixI True) (S1 (MetaSel (Just "rtext") NoSourceUnpackedness SourceStrict DecidedStrict) (Rec0 Text)) :+: C1 (MetaCons "RDate" PrefixI True) (S1 (MetaSel (Just "rdate") NoSourceUnpackedness SourceStrict DecidedStrict) (Rec0 Text) :*: S1 (MetaSel (Just "dtformat") NoSourceUnpackedness SourceStrict DecidedStrict) (Rec0 Text)))) :+: (C1 (MetaCons "RTime" PrefixI True) (S1 (MetaSel (Just "rtime") NoSourceUnpackedness SourceStrict DecidedStrict) (Rec0 RTimestamp)) :+: (C1 (MetaCons "RDouble" PrefixI True) (S1 (MetaSel (Just "rdouble") NoSourceUnpackedness SourceStrict DecidedStrict) (Rec0 Double)) :+: C1 (MetaCons "Null" PrefixI False) (U1 :: Type -> Type))))

data RTimestamp Source #

Basic data type to represent time. This is a strict data type, meaning whenever we evaluate a value of type RTimestamp, there must be also evaluated all the fields it contains.

Constructors

RTimestampVal
Fields year :: !Int month :: !Int day :: !Int hours24 :: !Int minutes :: !Int seconds :: !Int

Instances

Eq RTimestamp Source #
Instance details Defined in RTable.Core Methods (==) :: RTimestamp -> RTimestamp -> Bool # (/=) :: RTimestamp -> RTimestamp -> Bool #
Ord RTimestamp Source #
Instance details Defined in RTable.Core Methods compare :: RTimestamp -> RTimestamp -> Ordering # (<) :: RTimestamp -> RTimestamp -> Bool # (<=) :: RTimestamp -> RTimestamp -> Bool # (>) :: RTimestamp -> RTimestamp -> Bool # (>=) :: RTimestamp -> RTimestamp -> Bool # max :: RTimestamp -> RTimestamp -> RTimestamp # min :: RTimestamp -> RTimestamp -> RTimestamp #
Read RTimestamp Source #
Instance details Defined in RTable.Core Methods readsPrec :: Int -> ReadS RTimestamp # readList :: ReadS [RTimestamp] # readPrec :: ReadPrec RTimestamp # readListPrec :: ReadPrec [RTimestamp] #
Show RTimestamp Source #
Instance details Defined in RTable.Core Methods showsPrec :: Int -> RTimestamp -> ShowS # show :: RTimestamp -> String # showList :: [RTimestamp] -> ShowS #
Generic RTimestamp Source #
Instance details Defined in RTable.Core Associated Types type Rep RTimestamp :: Type -> Type # Methods from :: RTimestamp -> Rep RTimestamp x # to :: Rep RTimestamp x -> RTimestamp #
NFData RTimestamp Source #	In order to be able to force full evaluation up to Normal Form (NF)
Instance details Defined in RTable.Core Methods rnf :: RTimestamp -> () #
type Rep RTimestamp Source #
Instance details Defined in RTable.Core type Rep RTimestamp = D1 (MetaData "RTimestamp" "RTable.Core" "DBFunctor-0.1.1.0-7Lx0zfYxtD0CRZBriX9SCp" False) (C1 (MetaCons "RTimestampVal" PrefixI True) ((S1 (MetaSel (Just "year") NoSourceUnpackedness SourceStrict DecidedStrict) (Rec0 Int) :: (S1 (MetaSel (Just "month") NoSourceUnpackedness SourceStrict DecidedStrict) (Rec0 Int) :: S1 (MetaSel (Just "day") NoSourceUnpackedness SourceStrict DecidedStrict) (Rec0 Int))) :: (S1 (MetaSel (Just "hours24") NoSourceUnpackedness SourceStrict DecidedStrict) (Rec0 Int) :: (S1 (MetaSel (Just "minutes") NoSourceUnpackedness SourceStrict DecidedStrict) (Rec0 Int) :*: S1 (MetaSel (Just "seconds") NoSourceUnpackedness SourceStrict DecidedStrict) (Rec0 Int)))))

RTable Metadata Data Types

data RTableMData Source #

Metadata for an RTable

Constructors

RTableMData
Fields rtname :: RTableName Name of the `RTable` rtuplemdata :: RTupleMData Tuple-level metadata other metadata pkColumns :: [ColumnName] Primary Key uniqueKeys :: [[ColumnName]] List of unique keys i.e., each sublist is a unique key column combination

Instances

Eq RTableMData Source #
Instance details Defined in RTable.Core Methods (==) :: RTableMData -> RTableMData -> Bool # (/=) :: RTableMData -> RTableMData -> Bool #
Show RTableMData Source #
Instance details Defined in RTable.Core Methods showsPrec :: Int -> RTableMData -> ShowS # show :: RTableMData -> String # showList :: [RTableMData] -> ShowS #

type RTupleMData = (HashMap ColumnOrder ColumnName, HashMap ColumnName ColumnInfo) Source #

Basic Metadata of an RTuple. The RTuple metadata are accessed through a HashMap ColumnName ColumnInfo structure. I.e., for each column of the RTuple, we access the ColumnInfo structure to get Column-level metadata. This access is achieved by ColumnName. However, in order to provide the "impression" of a fixed column order per tuple (see RTuple definition), we provide another HashMap, the HashMap ColumnOrder ColumnName. So in the follwoing example, if we want to access the RTupleMData tupmdata ColumnInfo by column order, (assuming that we have N columns) we have to do the following:

     (snd tupmdata)!((fst tupmdata)!0)
     (snd tupmdata)!((fst tupmdata)!1)
     ...
     (snd tupmdata)!((fst tupmdata)!(N-1))

In the same manner in order to access the column of an RTuple (e.g., tup) by column order, we do the following:

     tup!((fst tupmdata)!0)
     tup!((fst tupmdata)!1)
     ...
     tup!((fst tupmdata)!(N-1))

data ColumnInfo Source #

Basic metadata for a column of an RTuple

Constructors

ColumnInfo
Fields name :: ColumnName dtype :: ColumnDType

Instances

Eq ColumnInfo Source #
Instance details Defined in RTable.Core Methods (==) :: ColumnInfo -> ColumnInfo -> Bool # (/=) :: ColumnInfo -> ColumnInfo -> Bool #
Show ColumnInfo Source #
Instance details Defined in RTable.Core Methods showsPrec :: Int -> ColumnInfo -> ShowS # show :: ColumnInfo -> String # showList :: [ColumnInfo] -> ShowS #

type ColumnOrder = Int Source #

type Name = Text Source #

Definition of the Name type

type ColumnName = Name Source #

Definition of the Column Name

type RTableName = Name Source #

Definition of the Table Name

data ColumnDType Source #

This is used only for metadata purposes (see ColumnInfo). The actual data type of a value is an RDataType The Text component of Date and Timestamp data constructors is the date format e.g., "DD/MM/YYYY", "DD/MM/YYYY HH24:MI:SS"

Constructors

UknownType
Integer
Varchar
Date Text
Timestamp Text
Double

Instances

Eq ColumnDType Source #
Instance details Defined in RTable.Core Methods (==) :: ColumnDType -> ColumnDType -> Bool # (/=) :: ColumnDType -> ColumnDType -> Bool #
Show ColumnDType Source #
Instance details Defined in RTable.Core Methods showsPrec :: Int -> ColumnDType -> ShowS # show :: ColumnDType -> String # showList :: [ColumnDType] -> ShowS #

type Delimiter = String Source #

Type Classes for "Tabular Data"

class RTabular a where Source #

Basic class to represent a data type that can be turned into an RTable. It implements the concept of "tabular data"

Methods

toRTable :: RTableMData -> a -> RTable Source #

fromRTable :: RTableMData -> RTable -> a Source #

Instances

RTabular CSV Source #	CSV data are "Tabular" data thus implement the `RTabular` interface
Instance details Defined in RTable.Data.CSV Methods toRTable :: RTableMData -> CSV -> RTable Source # fromRTable :: RTableMData -> RTable -> CSV Source #

Relational Algebra Operations

Operations Data Types

data ROperation Source #

Definition of Relational Algebra operations. These are the valid operations between RTables

Constructors

ROperationEmpty
RUnion	Union
RInter	Intersection
RDiff	Difference
RPrj	Projection
Fields colPrjList :: [ColumnName]
RFilter	Filter operation (an `RPredicate` can be any function of the signature `RTuple -> Bool` so it is much more powerful than a typical SQL filter expression, which is a boolean expression of comparison operators)
Fields fpred :: RPredicate
RInJoin	Inner Join (any type of join predicate allowed. Any function with a signature of the form: `RTuple -> RTuple -> Bool` is a valid join predicate. I.e., a function which returns `True` when two `RTuples` must be paired)
Fields jpred :: RJoinPredicate
RLeftJoin	Left Outer Join
Fields jpred :: RJoinPredicate
RRightJoin	Right Outer Join
Fields jpred :: RJoinPredicate
RSemiJoin	Semi-Join
Fields jpred :: RJoinPredicate
RAntiJoin	Anti-Join
Fields jpred :: RJoinPredicate
RAggregate	Performs aggregation operations on specific columns and returns a singleton RTable
Fields aggList :: [RAggOperation] list of aggregates
RGroupBy	A Group By operation The SQL equivalent is: `SELECT colGrByList, aggList FROM... GROUP BY colGrByList` Note that compared to SQL, we can have a more generic grouping predicate (i.e., when two `RTuple`s should belong in the same group) than just the equality of values on the common columns between two `RTuple`s. Also note, that in the case of an aggregation without grouping (equivalent to a single-group group by), then the grouping predicate should be: `_ _ -> True`
Fields gpred :: RGroupPredicate the grouping predicate aggList :: [RAggOperation] list of aggregates colGrByList :: [ColumnName] the Group By list of columns
RCombinedOp	A combination of unary `ROperation`s e.g., `(p plist).(f pred) (i.e., RPrj . RFilter)` , in the form of an `RTable -> RTable function.` In this sense we can also include a binary operation (e.g. join), if we partially apply the join to one `RTable`, e.g., (ij jpred rtab) . (p plist) . (f pred)
Fields rcombOp :: UnaryRTableOperation
RBinOp	A generic binary `ROperation`.
Fields rbinOp :: BinaryRTableOperation
ROrderBy	Order the `RTuple`s of the `RTable` acocrding to the specified list of Columns. First column in the input list has the highest priority in the sorting order.
Fields colOrdList :: [(ColumnName, OrderingSpec)]

type UnaryRTableOperation = RTable -> RTable Source #

A generic unary operation on a RTable

type BinaryRTableOperation = RTable -> RTable -> RTable Source #

A generic binary operation on RTable

data RAggOperation Source #

This data type represents all possible aggregate operations over an RTable. Examples are : Sum, Count, Average, Min, Max but it can be any other "aggregation". The essential property of an aggregate operation is that it acts on an RTable (or on a group of RTuples - in the case of the RGroupBy operation) and produces a single RTuple.

An aggregate operation is applied on a specific column (source column) and the aggregated result will be stored in the target column. It is important to understand that the produced aggregated RTuple is different from the input RTuples. It is a totally new RTuple, that will consist of the aggregated column(s) (and the grouping columns in the case of an RGroupBy).

Constructors

RAggOperation
Fields sourceCol :: ColumnName Source column targetCol :: ColumnName Target column aggFunc :: RTable -> RTuple here we define the aggegate function to be applied on an RTable

Available Aggregate Operations

type AggFunction = ColumnName -> RTable -> RDataType Source #

Aggregation Function type. An aggregation function receives as input a source column (i.e., a ColumnName) of a source RTable and returns an aggregated value, which is the result of the aggregation on the values of the source column.

raggGenericAgg Source #

Arguments

:: AggFunction	custom aggregation function
-> ColumnName	source column
-> ColumnName	target column
-> RAggOperation

Returns an RAggOperation with a custom aggregation function provided as input

raggSum Source #

Arguments

:: ColumnName	source column
-> ColumnName	target column
-> RAggOperation

The Sum aggregate operation

raggCount Source #

Arguments

:: ColumnName	source column
-> ColumnName	target column
-> RAggOperation

The Count aggregate operation Count aggregation (no distinct)

raggCountDist Source #

Arguments

:: ColumnName	source column
-> ColumnName	target column
-> RAggOperation

The CountDist aggregate operation Count distinct aggregation (i.e., count(distinct col) in SQL). Returns the distinct number of values for this column.

raggCountStar Source #

Arguments

:: ColumnName	target column to save the result aggregated value
-> RAggOperation

The CountStar aggregate operation Returns the number of RTuples in the RTable (i.e., count(*) in SQL)

raggAvg Source #

Arguments

:: ColumnName	source column
-> ColumnName	target column
-> RAggOperation

The Average aggregate operation

raggMax Source #

Arguments

:: ColumnName	source column
-> ColumnName	target column
-> RAggOperation

The Max aggregate operation

raggMin Source #

Arguments

:: ColumnName	source column
-> ColumnName	target column
-> RAggOperation

The Min aggregate operation

raggStrAgg Source #

Arguments

:: ColumnName	source column
-> ColumnName	target column
-> Delimiter	delimiter string
-> RAggOperation

The StrAgg aggregate operation This is known as "string_agg"" in Postgresql and "listagg" in Oracle. It aggregates the values of a text RDataType column with a specified delimiter

Predicates

type RPredicate = RTuple -> Bool Source #

A Predicate. It defines an arbitrary condition over the columns of an RTuple. It is used primarily in the filter RFilter operation and used in the filter function f.

type RGroupPredicate = RTuple -> RTuple -> Bool Source #

The Group By Predicate It defines the condition for two RTuples to be included in the same group.

type RJoinPredicate = RTuple -> RTuple -> Bool Source #

The Join Predicate. It defines when two RTuples should be paired.

data RUpsertPredicate Source #

The Upsert Predicate. It defines when two RTuples should be paired in a merge operation. The matching predicate must be applied on a specific set of matching columns. The source RTable in the Upsert operation must return a unique set of RTuples, if grouped by this set of matching columns. Otherwise an exception (UniquenessViolationInUpsert) is thrown.

Constructors

RUpsertPredicate
Fields matchCols :: [ColumnName] matchPred :: RTuple -> RTuple -> Bool

Operation Execution

runUnaryROperation Source #

Arguments

:: ROperation	input ROperation
-> RTable	input RTable
-> RTable	output RTable

Execute a Unary ROperation

ropU :: ROperation -> RTable -> RTable Source #

ropU operator executes a unary ROperation. A short name for the runUnaryROperation function

runUnaryROperationRes Source #

Arguments

:: ROperation	input ROperation
-> RTable	input RTable
-> RTabResult	output: Result of operation

Execute a Unary ROperation and return an RTabResult

ropUres :: ROperation -> RTable -> RTabResult Source #

ropUres operator executes a unary ROperation. A short name for the runUnaryROperationRes function

runBinaryROperation Source #

Arguments

:: ROperation	input ROperation
-> RTable	input RTable1
-> RTable	input RTable2
-> RTable	output RTabl

Execute a Binary ROperation

ropB :: ROperation -> RTable -> RTable -> RTable Source #

ropB operator executes a binary ROperation. A short name for the runBinaryROperation function

runBinaryROperationRes Source #

Arguments

:: ROperation	input ROperation
-> RTable	input RTable1
-> RTable	input RTable2
-> RTabResult	output: Result of operation

Execute a Binary ROperation and return an RTabResult

ropBres :: ROperation -> RTable -> RTable -> RTabResult Source #

ropBres operator executes a binary ROperation. A short name for the runBinaryROperationRes function

Operation Result

type RTuplesRet = Sum Int Source #

Number of RTuples returned by an RTable operation

type RTabResult = Writer RTuplesRet RTable Source #

RTabResult is the result of an RTable operation and is a Writer Monad, that includes the new RTable, as well as the number of RTuples returned by the operation.

rtabResult Source #

Arguments

:: (RTable, RTuplesRet)	input pair
-> RTabResult	output Writer Monad

Creates an RTabResult (i.e., a Writer Monad) from a result RTable and the number of RTuples that it returned

runRTabResult :: RTabResult -> (RTable, RTuplesRet) Source #

Returns the info "stored" in the RTabResult Writer Monad

execRTabResult :: RTabResult -> RTuplesRet Source #

Returns the "log message" in the RTabResult Writer Monad, which is the number of returned RTuples

rtuplesRet :: Int -> RTuplesRet Source #

Creates an RTuplesRet type

getRTuplesRet :: RTuplesRet -> Int Source #

Return the number embedded in the RTuplesRet data type

Operation Composition

An Example of Operation Composition

>>> -- define a simple RTable with four RTuples of a single column "col1"
>>> let tab1 = rtableFromList [rtupleFromList [("col1", RInt 1)], rtupleFromList [("col1", RInt 2)], rtupleFromList [("col1", RInt 3)], rtupleFromList [("col1", RInt 4)] ]

>>> printRTable tab1

   col1
   ~~~~
   1
   2
   3
   4


   4 rows returned
   ---------

>>> -- define a filter operation col1 > 2
>>> let rop1 = RFilter (\t-> t<!>"col1" > 2)

>>> -- define another filter operation col1 > 3
>>> let rop2 = RFilter (\t-> t<!>"col1" > 3)

>>> -- Composition of RTable operations via (.) (rop1 returns 2 RTuples and rop2 returns 1 RTuple)
>>> printRTable $ (ropU rop2) . (ropU rop1) $ tab1

   col1
   ~~~~
   4


   1 row returned
   ---------

>>> -- Composition of RTabResult operations via (<=<) (Note: that the result includes the sum of the returned RTuples in each operation, i.e., 2+1 = 3)
>>> execRTabResult $ (ropUres rop2) <=< (ropUres rop1) $ tab1
Sum {getSum = 3}
>>> printRTable $ fst.runRTabResult $ (ropUres rop2) <=< (ropUres rop1) $ tab1

   col1
   ~~~~
   4


   1 row returned
   ---------

(.) :: (b -> c) -> (a -> b) -> a -> c infixr 9 #

Function composition.

(<=<) :: Monad m => (b -> m c) -> (a -> m b) -> a -> m c infixr 1 #

Right-to-left composition of Kleisli arrows. (>=>), with the arguments flipped.

Note how this operator resembles function composition (.):

(.)   ::            (b ->   c) -> (a ->   b) -> a ->   c
(<=<) :: Monad m => (b -> m c) -> (a -> m b) -> a -> m c

RTable Functions

Relational Algebra Functions

runRfilter :: RPredicate -> RTable -> RTable Source #

Executes an RFilter operation

f :: RPredicate -> RTable -> RTable Source #

Filter (i.e. selection operator). A short name for the runRFilter function

runInnerJoinO :: RJoinPredicate -> RTable -> RTable -> RTable Source #

Implements an Inner Join operation between two RTables (any type of join predicate is allowed) This Inner Join implementation follows Oracle DB's convention for common column names. When we have two tuples t1 and t2 with a common column name (lets say "Common"), then the resulting tuple after a join will be "Common", "Common_1", so a "_1" suffix is appended. The tuple from the left table by convention retains the original column name. So "Column_1" is the column from the right table. If "Column_1" already exists, then "Column_2" is used.

iJ :: RJoinPredicate -> RTable -> RTable -> RTable Source #

RTable Inner Join Operator. A short name for the runInnerJoinO function

runLeftJoin :: RJoinPredicate -> RTable -> RTable -> RTable Source #

Implements a Left Outer Join operation between two RTables (any type of join predicate is allowed), i.e., the rows of the left RTable will be preserved. Note that when dublicate keys encountered that is, since the underlying structure for an RTuple is a Data.HashMap.Strict, only one value per key is allowed. So in the context of joining two RTuples the value of the left RTuple on the common key will be prefered.

Implements a Left Outer Join operation between two RTables (any type of join predicate is allowed), i.e., the rows of the left RTable will be preserved. A Left Join : tabLeft LEFT JOIN tabRight ON joinPred where tabLeft is the preserving table can be defined as: the Union between the following two RTables:

The result of the inner join: tabLeft INNER JOIN tabRight ON joinPred
The rows from the preserving table (tabLeft) that DONT satisfy the join condition, enhanced with the columns of tabRight returning Null values.

The common columns will appear from both tables but only the left table column's will retain their original name.

lJ :: RJoinPredicate -> RTable -> RTable -> RTable Source #

RTable Left Outer Join Operator. A short name for the runLeftJoin function

runRightJoin :: RJoinPredicate -> RTable -> RTable -> RTable Source #

Implements a Right Outer Join operation between two RTables (any type of join predicate is allowed), i.e., the rows of the right RTable will be preserved. A Right Join : tabLeft RIGHT JOIN tabRight ON joinPred where tabRight is the preserving table can be defined as: the Union between the following two RTables:

The result of the inner join: tabLeft INNER JOIN tabRight ON joinPred
The rows from the preserving table (tabRight) that DONT satisfy the join condition, enhanced with the columns of tabLeft returning Null values.

The common columns will appear from both tables but only the right table column's will retain their original name.

rJ :: RJoinPredicate -> RTable -> RTable -> RTable Source #

RTable Right Outer Join Operator. A short name for the runRightJoin function

runFullOuterJoin :: RJoinPredicate -> RTable -> RTable -> RTable Source #

Implements a Full Outer Join operation between two RTables (any type of join predicate is allowed) A full outer join is the union of the left and right outer joins respectively. The common columns will appear from both tables but only the left table column's will retain their original name (just by convention).

foJ :: RJoinPredicate -> RTable -> RTable -> RTable Source #

Implements a Right Outer Join operation between two RTables (any type of join predicate is allowed) i.e., the rows of the right RTable will be preserved. Note that when dublicate keys encountered that is, since the underlying structure for an RTuple is a Data.HashMap.Strict, only one value per key is allowed. So in the context of joining two RTuples the value of the right RTuple on the common key will be prefered.

RTable Full Outer Join Operator. A short name for the runFullOuterJoin function

sJ :: RJoinPredicate -> RTable -> RTable -> RTable Source #

RTable semi-join operator. A short name for the runSemiJoin function

runSemiJoin :: RJoinPredicate -> RTable -> RTable -> RTable Source #

Implements the semi-Join operation between two RTables (any type of join predicate is allowed) It returns the RTuples from the left RTable that match with the right RTable. Note that if an RTuple from the left RTable matches more than one RTuples from the right RTable the semi join operation will return only a single RTuple.

aJ :: RJoinPredicate -> RTable -> RTable -> RTable Source #

RTable anti-join operator. A short name for the runAntiJoin function

runAntiJoin :: RJoinPredicate -> RTable -> RTable -> RTable Source #

Implements the anti-Join operation between two RTables (any type of join predicate is allowed) It returns the RTuples from the left RTable that DONT match with the right RTable.

joinRTuples :: RTuple -> RTuple -> RTuple Source #

Joins two RTuples into one. In this join we follow Oracle DB's convention when joining two tuples with some common column names. When we have two tuples t1 and t2 with a common column name (lets say Common), then the resulitng tuple after a join will be Common, Common_1, so a "_1" suffix is appended. The tuple from the left table by convention retains the original column name. So Column_1 is the column from the right table. If Column_1 already exists, then Column_2 is used.

runUnion :: RTable -> RTable -> RTable Source #

Implements the union of two RTables as a union of two lists (see List). Duplicates, and elements of the first list, are removed from the the second list, but if the first list contains duplicates, so will the result

Implements the union of two RTables. Note that dublicate RTuple elimination takes places.

runUnionAll :: RTable -> RTable -> RTable Source #

Implements the union-all of two RTables. I.e., a union without dublicate RTuple elimination. Runs in O(m+n).

u :: RTable -> RTable -> RTable Source #

RTable Union Operator. A short name for the runUnion function

runIntersect :: RTable -> RTable -> RTable Source #

Implements the intersection of two RTables

i :: RTable -> RTable -> RTable Source #

RTable Intersection Operator. A short name for the runIntersect function

runDiff :: RTable -> RTable -> RTable Source #

Implements the set Difference of two RTables as the diff of two lists (see List).

d :: RTable -> RTable -> RTable Source #

RTable Difference Operator. A short name for the runDiff function

runProjection Source #

Arguments

:: [ColumnName]	list of column names to be included in the final result RTable
-> RTable
-> RTable

Implements RTable projection operation. If a column name does not exist, then an empty RTable is returned.

runProjectionMissedHits Source #

Arguments

:: [ColumnName]	list of column names to be included in the final result RTable
-> RTable
-> RTable

Implements RTable projection operation. If a column name does not exist, then the returned RTable includes this column with a Null value. This projection implementation allows missed hits.

p :: [ColumnName] -> RTable -> RTable Source #

RTable Projection operator. A short name for the runProjection function

runAggregation Source #

Arguments

:: [RAggOperation]	Input Aggregate Operations
-> RTable	Input RTable
-> RTable	Output singleton RTable

Implements the aggregation operation on an RTable It aggregates the specific columns in each AggOperation and returns a singleton RTable i.e., an RTable with a single RTuple that includes only the agg columns and their aggregated value.

rAgg :: [RAggOperation] -> RTable -> RTable Source #

Aggregation Operator. A short name for the runAggregation function

runGroupBy Source #

Arguments

:: RGroupPredicate	Grouping predicate, in order to form the groups of RTuples (it defines when two RTuples should be included in the same group)
-> [RAggOperation]	Aggregations to be applied on specific columns
-> [ColumnName]	List of grouping column names (GROUP BY clause in SQL) We assume that all RTuples in the same group have the same value in these columns
-> RTable	input RTable
-> RTable	output RTable

Implements the GROUP BY operation over an RTable.

rG :: RGroupPredicate -> [RAggOperation] -> [ColumnName] -> RTable -> RTable Source #

Group By Operator. A short name for the runGroupBy function

groupNoAggList Source #

Arguments

:: RGroupPredicate	Grouping predicate, in order to form the groups of `RTuple`s (it defines when two `RTuple`s should be included in the same group)
-> [ColumnName]	List of grouping column names (GROUP BY clause in SQL) We assume that all RTuples in the same group have the same value in these columns
-> RTable	input `RTable`
-> [RTable]	output list of `RTable`s where each one corresponds to a group

Implement a grouping operation over an RTable. No aggregation takes place. It returns the individual groups as separate RTables in a list. In total the initial set of RTuples is retained. If an empty RTable is provided as input, then a ["empty RTable"] is returned.

groupNoAgg Source #

Arguments

:: RGroupPredicate	Grouping predicate, in order to form the groups of `RTuple`s (it defines when two `RTuple`s should be included in the same group)
-> [ColumnName]	List of grouping column names (GROUP BY clause in SQL) We assume that all `RTuple`s in the same group have the same value in these columns
-> RTable	input `RTable`
-> RTable	output `RTable`

Implement a grouping operation over an RTable. No aggregation takes place. The output RTable has exactly the same RTuples, as the input, but these are grouped based on the input grouping predicate. If an empty RTable is provided as input, then an empty RTable is returned.

runOrderBy Source #

Arguments

:: [(ColumnName, OrderingSpec)]	Input ordering specification
-> RTable	Input RTable
-> RTable	Output RTable

Implements the ORDER BY operation. First column in the input list has the highest priority in the sorting order We treat Null as the maximum value (anything compared to Null is smaller). This way Nulls are send at the end (i.e., "Nulls Last" in SQL parlance). This is for Asc ordering. For Desc ordering, we have the opposite. Nulls go first and so anything compared to Null is greater. @ SQL example with q as (select case when level < 4 then level else NULL end c1 -- , level c2 from dual connect by level < 7 ) select * from q order by c1

C1 ---- 1 2 3 Null Null Null

with q as (select case when level < 4 then level else NULL end c1 -- , level c2 from dual connect by level < 7 ) select * from q order by c1 desc

rO :: [(ColumnName, OrderingSpec)] -> RTable -> RTable Source #

Order By Operator. A short name for the runOrderBy function

runCombinedROp Source #

Arguments

:: (RTable -> RTable)	input combined RTable operation
-> RTable	input RTable that the input function will be applied to
-> RTable	output RTable

runCombinedROp: A Higher Order function that accepts as input a combination of unary ROperations e.g., (p plist).(f pred) expressed in the form of a function (RTable -> Rtable) and applies this function to the input RTable. In this sense we can also include a binary operation (e.g. join), if we partially apply the join to one RTable e.g., (ij jpred rtab) . (p plist) . (f pred)

rComb :: (RTable -> RTable) -> RTable -> RTable Source #

A short name for the runCombinedROp function

Decoding

data IgnoreDefault Source #

Constructors

Ignore
NotIgnore

Instances

Eq IgnoreDefault Source #
Instance details Defined in RTable.Core Methods (==) :: IgnoreDefault -> IgnoreDefault -> Bool # (/=) :: IgnoreDefault -> IgnoreDefault -> Bool #
Show IgnoreDefault Source #
Instance details Defined in RTable.Core Methods showsPrec :: Int -> IgnoreDefault -> ShowS # show :: IgnoreDefault -> String # showList :: [IgnoreDefault] -> ShowS #

decodeRTable Source #

Arguments

:: ColumnName	ColumnName key
-> RDataType	Search value
-> RDataType	Return value
-> RDataType	Default value
-> IgnoreDefault	Ignore default indicator
-> RTable	input RTable
-> RTable

It receives an RTable, a search value and a default value. It returns a new RTable which is identical to the source one but for each RTuple, for the specified column: if the search value was found then the specified Return Value is returned else the default value is returned (if the ignore indicator is not set), otherwise (if the ignore indicator is set), it returns the existing value for the column for each RTuple. If you pass an empty RTable, then it returns an empty RTable Throws a ColumnDoesNotExist exception, if the column does not exist

decodeColValue Source #

Arguments

:: ColumnName	ColumnName key
-> RDataType	Search value
-> RDataType	Return value
-> RDataType	Default value
-> IgnoreDefault	Ignore default indicator
-> RTuple	input RTuple
-> RDataType

It receives an RTuple and lookups the value at a specfic column name. Then it compares this value with the specified search value. If it is equal to the search value then it returns the specified Return Value. If not, then it returns the specified default Value, if the ignore indicator is not set, otherwise (if the ignore indicator is set) it returns the existing value. If you pass an empty RTuple, then it returns Null. Throws a ColumnDoesNotExist exception, if this map contains no mapping for the key.

Date/Time

toRTimestamp Source #

Arguments

:: String	Format string e.g., "DD/MM/YYYY HH:MI:SS"
-> String	Timestamp string
-> RTimestamp

Returns an RTimestamp from an input String and a format String.

Valid format patterns are:

For year: YYYY, e.g., "0001", "2018"
For month: MM, e.g., "01", "1", "12"
For day: DD, e.g., "01", "1", "31"
For hours: HH, HH24 e.g., "00", "23" I.e., hours must be specified in 24 format
For minutes: MI, e.g., "01", "1", "59"
For seconds: SS, e.g., "01", "1", "59"

Example of a typical format string is: "DD/MM/YYYY HH:MI:SS

If no valid format pattern is found then an UnsupportedTimeStampFormat exception is thrown

createRTimestamp Source #

Arguments

:: String	Format string e.g., "DD/MM/YYYY HH24:MI:SS"
-> String	Timestamp string
-> RTimestamp

Creates an RTimestamp data type from an input timestamp format string and a timestamp value represented as a String. Valid format patterns are:

For year: YYYY, e.g., "0001", "2018"
For month: MM, e.g., "01", "1", "12"
For day: DD, e.g., "01", "1", "31"
For hours: HH, HH24 e.g., "00", "23" I.e., hours must be specified in 24 format
For minutes: MI, e.g., "01", "1", "59"
For seconds: SS, e.g., "01", "1", "59"

Example of a typical format string is: "DD/MM/YYYY HH:MI:SS

If no valid format pattern is found then an UnsupportedTimeStampFormat exception is thrown

toUTCTime :: RTimestamp -> UTCTime Source #

Convert an RTimestamp value to a Universal Time value (UTCTime)

fromUTCTime :: UTCTime -> RTimestamp Source #

Convert a Universal Time value (UTCTime) to an RTimestamp value

rTimestampToRText Source #

Arguments

:: String	Output format e.g., "DD/MM/YYYY HH24:MI:SS"
-> RTimestamp	Input RTimestamp
-> RDataType	Output RText

rTimeStampToText: converts an RTimestamp value to RText Valid input formats are:

1. "DD/MM/YYYY HH24:MI:SS"
2. "YYYYMMDD-HH24.MI.SS"
3. "YYYYMMDD"
4. "YYYYMM"
5. "YYYY"

stdTimestampFormat :: String Source #

Standard timestamp format. For example: "DDMMYYYY HH24:MI:SS"

stdDateFormat :: [Char] Source #

Standard date format

Character/Text

instrRText Source #

Arguments

:: RDataType	substring to search for
-> RDataType	string to be searched
-> Maybe Int	Position within input string of substr 1st character

Search for the first occurence of a substring within a RText string and return the 1st character position, or Nothing if the substring is not found, or if an non-text RDataType, is given as input.

instr Source #

Arguments

:: Eq a
=> [a]	substring to search for
-> [a]	string to be searched
-> Maybe Int	Position within input string of substr 1st character

Search for the first occurence of a substring within a String and return the 1st character position, or Nothing if the substring is not found.

instrText Source #

Arguments

:: Text	substring to search for
-> Text	string to be searched
-> Maybe Int	Position within input string of substr 1st character

Search for the first occurence of a substring within a Text string and return the 1st character position, or Nothing if the substring is not found.

rdtappend :: RDataType -> RDataType -> RDataType Source #

Concatenates two Text RDataTypes, in all other cases of RDataType it returns Null.

stripRText Source #

Arguments

:: RDataType	input string
-> RDataType

stripRText : O(n) Remove leading and trailing white space from a string. If the input RDataType is not an RText, then Null is returned

removeCharAroundRText :: Char -> RDataType -> RDataType Source #

Helper function to remove a character around (from both beginning and end) of an (RText t) value

isText :: RDataType -> Bool Source #

Returns True only if this is an RText

NULL-Related

nvlRTable Source #

Arguments

:: ColumnName	ColumnName key
-> RDataType	Default value
-> RTable	input RTable
-> RTable

It receives an RTable and a default value. It returns a new RTable which is identical to the source one but for each RTuple, for the specified column every Null value in every RTuple has been replaced by a default value If you pass an empty RTable, then it returns an empty RTable Throws a ColumnDoesNotExist exception, if the column does not exist

nvlRTuple Source #

Arguments

:: ColumnName	ColumnName key
-> RDataType	Default value in the case of Null column values
-> RTuple	input RTuple
-> RTuple	output RTuple

It receives an RTuple and a default value. It returns a new RTuple which is identical to the source one but every Null value in the specified colummn has been replaced by a default value

isNullRTuple :: RTuple -> Bool Source #

Returns True if the input RTuple is a Null RTuple, otherwise it returns False Note that a Null RTuple has all its values equal with Null but it still has columns. This is different from an empty RTuple, which is an RTuple withi no columns and no values whatsoever. See isRTupEmpty.

isNull :: RDataType -> Bool Source #

Use this function to compare an RDataType with the Null value because due to Null logic x == Null or x /= Null, will always return False. It returns True if input value is Null

isNotNull :: RDataType -> Bool Source #

Use this function to compare an RDataType with the Null value because deu to Null logic x == Null or x /= Null, will always return False. It returns True if input value is Not Null

nvl Source #

Arguments

:: RDataType	input value
-> RDataType	default value returned if input value is Null
-> RDataType	output value

Returns the 1st parameter if this is not Null, otherwise it returns the 2nd.

nvlColValue Source #

Arguments

:: ColumnName	ColumnName key
-> RDataType	value returned if original value is Null
-> RTuple	input RTuple
-> RDataType	output value

Returns the value of a specific column (specified by name) if this is not Null. If this value is Null, then it returns the 2nd parameter. If you pass an empty RTuple, then it returns Null. Throws a ColumnDoesNotExist exception, if this map contains no mapping for the key.

Access RTable

isRTabEmpty :: RTable -> Bool Source #

Test whether an RTable is empty

headRTup :: RTable -> RTuple Source #

Get the first RTuple from an RTable

limit Source #

Arguments

:: Int	number of N `RTuple`s to return
-> RTable	input `RTable`
-> RTable	output `RTable`

returns the N first RTuples of an RTable

isRTupEmpty :: RTuple -> Bool Source #

Test whether an RTuple is empty

getRTupColValue Source #

Arguments

:: ColumnName	ColumnName key
-> RTuple	Input RTuple
-> RDataType	Output value

getRTupColValue :: Returns the value of an RTuple column based on the ColumnName key if the column name is not found, then it returns Null. !!!Note that this might be confusing since there might be an existing column name with a Null value!!!

rtupLookup Source #

Arguments

:: ColumnName	ColumnName key
-> RTuple	Input RTuple
-> Maybe RDataType	Output value

Returns the value of an RTuple column based on the ColumnName key if the column name is not found, then it returns Nothing

rtupLookupDefault Source #

Arguments

:: RDataType	Default value to return in the case the column name does not exist in the RTuple
-> ColumnName	ColumnName key
-> RTuple	Input RTuple
-> RDataType	Output value

Returns the value of an RTuple column based on the ColumnName key if the column name is not found, then it returns a default value

(<!>) Source #

Arguments

:: RTuple	Input RTuple
-> ColumnName	ColumnName key
-> RDataType	Output value

Operator for getting a column value from an RTuple Throws a ColumnDoesNotExist exception, if this map contains no mapping for the key.

(<!!>) Source #

Arguments

:: RTuple	Input RTuple
-> ColumnName	ColumnName key
-> Maybe RDataType	Output value

Safe Operator for getting a column value from an RTuple if the column name is not found, then it returns Nothing

Conversions

rtableToList :: RTable -> [RTuple] Source #

Turns an RTable to a list of RTuples

concatRTab :: [RTable] -> RTable Source #

Concatenates a list of RTables to a single RTable. Essentially, it unions (see runUnion) all RTables of the list.

rtupleToList :: RTuple -> [(ColumnName, RDataType)] Source #

Turns an RTuple to a List

toListRDataType :: RTupleMData -> RTuple -> [RDataType] Source #

toListRDataType: returns a list of RDataType values of an RTuple, in the fixed column order of the RTuple

toText :: RDataType -> Maybe Text Source #

Return the Text out of an RDataType If a non-text RDataType is given then Nothing is returned.

fromText :: Text -> RDataType Source #

Return an RDataType from Text

Container Functions

rtabMap :: (RTuple -> RTuple) -> RTable -> RTable Source #

Map function over an RTable.

rtabFoldr' :: (RTuple -> RTable -> RTable) -> RTable -> RTable -> RTable Source #

This is a fold operation on a RTable that returns an RTable. It is similar with : foldr' :: (a -> b -> b) -> b -> Vector a -> b of Vector, which is an O(n) Right fold with a strict accumulator

rtabFoldl' :: (RTable -> RTuple -> RTable) -> RTable -> RTable -> RTable Source #

This is a fold operation on RTable that returns an RTable. It is similar with : foldl' :: (a -> b -> a) -> a -> Vector b -> a of Vector, which is an O(n) Left fold with a strict accumulator

rtupleMap :: (RDataType -> RDataType) -> RTuple -> RTuple Source #

O(n) Transform this RTuple by applying a function to every value

rtupleMapWithKey :: (ColumnName -> RDataType -> RDataType) -> RTuple -> RTuple Source #

O(n) Transform this RTuple by applying a function to every value

rdatatypeFoldr' :: (RTuple -> RDataType -> RDataType) -> RDataType -> RTable -> RDataType Source #

This is a fold operation on a RTable that returns an RDataType value. It is similar with : foldr' :: (a -> b -> b) -> b -> Vector a -> b of Vector, which is an O(n) Right fold with a strict accumulator

rdatatypeFoldl' :: (RDataType -> RTuple -> RDataType) -> RDataType -> RTable -> RDataType Source #

This is a fold operation on RTable that returns an RDataType value It is similar with : foldl' :: (a -> b -> a) -> a -> Vector b -> a of Vector, which is an O(n) Left fold with a strict accumulator

Modify RTable (DML)

insertAppendRTab :: RTuple -> RTable -> RTable Source #

O(n) append an RTuple to an RTable Please note that this is an immutable implementation of an RTable insert. This simply means that the insert operation returns a new RTable and does not affect the original RTable.

insertPrependRTab :: RTuple -> RTable -> RTable Source #

O(n) prepend an RTuple to an RTable Please note that this is an immutable implementation of an RTable insert. This simply means that the insert operation returns a new RTable and does not affect the original RTable.

insertRTabToRTab Source #

Arguments

:: RTable	Source `RTable` to be inserted
-> RTable	Target `RTable`
-> RTable	Final Result

Insert an RTable to an existing RTable. This is equivalent to an INSERT INTO SELECT caluse in SQL. We want to insert into an RTable the results of a "subquery", which in our case is materialized via the input RTable. Please note that this is an immutable implementation of an RTable insert. This simply means that the insert operation returns a new RTable and does not affect the original RTable. Also note that the source and target RTables should have the same structure. By "structure", we mean that the ColumnNames and the corresponding data types must match. Essentially what we record in the ColumnInfo must be the same for the two RTables. Otherwise a ConflictingRTableStructures exception will be thrown.

deleteRTab Source #

Arguments

:: RPredicate	Predicate specifying the `Rtuple`s that must be deleted
-> RTable	`RTable` that the deletion will be applied
-> RTable	Result `RTable`

Delete RTuples from an RTable based on an RPredicate. Please note that this is an immutable implementation of an RTable update. This simply means that the delete operation returns a new RTable. So, the original RTable remains unchanged and no deletion in-place takes place whatsoever. Moreover, if we have multiple threads deleting an RTable, due to immutability, each thread "sees" its own copy of the RTable and thus there is no need for locking the deleted RTuples, as happens in a common RDBMS.

updateRTab Source #

Arguments

:: [(ColumnName, RDataType)]	List of column names to be updated with the corresponding new values
-> RPredicate	An RTuple -> Bool function that specifies the RTuples to be updated
-> RTable	Input RTable
-> RTable	Output RTable

Update an RTable. The input includes a list of (ColumnName, new Value) pairs. Also a filter predicate is specified, in order to restrict the update only to those RTuples that fulfill the predicate. Please note that this is an immutable implementation of an RTable update. This simply means that the update operation returns a new RTable that includes all the RTuples of the original RTable, both the ones that have been updated and the others that have not. So, the original RTable remains unchanged and no update in-place takes place whatsoever. Moreover, if we have multiple threads updating an RTable, due to immutability, each thread "sees" its own copy of the RTable and thus there is no need for locking the updated RTuples, as happens in a common RDBMS.

upsertRTab Source #

Arguments

:: RTable	Source `RTable`, i.e., the equivalent to an SQL `USING` subclause
-> RUpsertPredicate	The `RTuple` matching predicate for the merge operation
-> [ColumnName]	List of column names to be updated with the corresponding new values coming from the source `RTuple`s that match with the target `RTuple`s based on the `RUpsertPredicate`
-> RPredicate	A filter that specifies the target `RTuple`s to be updated
-> RTable	The target `RTable`
-> RTable	Final Result

Upsert (Update+Insert, aka Merge) Operation. We provide a source RTable and a matching condition (RUpsertPredicate) to the RTuples of the target RTable. An RTuple from the target RTable might match to a single only RTuple in the source RTable, or not match at all. If it is matched to more than one RTuples then an exception (UniquenessViolationInUpsert) is thrown. When an RTuple from the target RTable is matched to a source RTuple, then the corresponding columns of the target RTuple are updated with the new values provided in the source RTuple. This takes place for the target RTuples that match but also that satisfy the input RPredicate. Thus we can restrict further with a filter the RTuples of the target RTable where the update will take place. Finally, the source RTuples that did not match to the target RTable, are inserted (appended) to the target RTable

Please note that this is an immutable implementation of an RTable upsert. This simply means that the upsert operation returns a new RTable and does not affect the original RTable. Moreover, if we have multiple threads updating an RTable, due to immutability, each thread "sees" its own copy of the RTable and thus there is no need for locking the updated RTuples, as happens in a common RDBMS.

Also note that the source and target RTables should have the same structure. By "structure", we mean that the ColumnNames and the corresponding data types must match. Essentially what we record in the ColumnInfo must be the same for the two RTables. Otherwise a ConflictingRTableStructures exception will be thrown.

 An Example:
 Source RTable: src = 
     Id  |   Msg         | Other
     ----|---------------|-------
     1   |   "hello2"    |"a"    
     2   |   "world2"    |"a"    
     3   |   "new"       |"a"    

 Target RTable: trg = 
     Id  |   Msg         | Other
     ----|---------------|-------
     1   |   "hello1"    |"b"    
     2   |   "world1"    |"b"    
     4   |   "old"       |"b"    
     5   |   "hello"     |"b"    

 >>> upsertRTab  src
                 RUpsertPredicate {matchCols = ["Id"], matchPred = \t1 t2 -> t1 <!> "Id" == t2 <!> "Id" }
                 ["Msg"]
                 (\t ->   let 
                             msg = case toText (t <!> "Msg") of
                                         Just t -> t
                                         Nothing -> pack ""
                         in (take 5 msg) == (pack "hello")
                 )  -- Msg like "hello%"
                 trg

 Result RTable: rslt = 
     Id  |   Msg         | Other
     ----|---------------|-------
     1   |   "hello2"    |"b"   (Note that only column "Msg" has been overwritten, as per the 3rd argument) 
     2   |   "world1"    |"b"    
     3   |   "new"       |"a"    
     4   |   "old"       |"b"    
     5   |   "hello"     |"b"

updateRTuple Source #

Arguments

:: ColumnName	key where the update will take place
-> RDataType	new value
-> RTuple	input RTuple
-> RTuple	output RTuple

Update an RTuple at a specific column specified by name with a value. If the ColumnName exists, then the value is updated with the input value. If the ColumnName does not exist, then a ColumnDoesNotExist exception is thrown.

upsertRTuple Source #

Arguments

:: ColumnName	key where the upsert will take place
-> RDataType	new value
-> RTuple	input RTuple
-> RTuple	output RTuple

Upsert (update/insert) an RTuple at a specific column specified by name with a value If the cname key is not found then the (columnName, value) pair is inserted. If it exists then the value is updated with the input value.

Create/Alter RTable (DDL)

emptyRTable :: RTable Source #

emptyRTable: Create an empty RTable

createSingletonRTable :: RTuple -> RTable Source #

Creates an RTable with a single RTuple

rtableFromList :: [RTuple] -> RTable Source #

Creates an RTable from a list of RTuples

addColumn Source #

Arguments

:: ColumnName	name of the column to be added
-> RDataType	Default value of the new column. All RTuples will initially have this value in this column
-> RTable	Input RTable
-> RTable	Output RTable

addColumn: adds a column to an RTable

removeColumn Source #

Arguments

:: ColumnName	Column to be removed
-> RTable	input RTable
-> RTable	output RTable

removeColumn : removes a column from an RTable. The column is specified by ColumnName. If this ColumnName does not exist in the RTuple of the input RTable then nothing is happened, the RTuple remains intact.

emptyRTuple :: RTuple Source #

Creates an empty RTuple (i.e., one with no column,value mappings)

createNullRTuple :: [ColumnName] -> RTuple Source #

Creates a Null RTuple based on a list of input Column Names. A Null RTuple is an RTuple where all column names correspond to a Null value (Null is a data constructor of RDataType)

createRTuple Source #

Arguments

:: [(ColumnName, RDataType)]	input list of (columnname,value) pairs
-> RTuple

createRTuple: Create an Rtuple from a list of column names and values

rtupleFromList :: [(ColumnName, RDataType)] -> RTuple Source #

Create an RTuple from a list

createRDataType Source #

Arguments

:: Typeable a
=> a	input value
-> RDataType	output RDataType

createRDataType: Get a value of type a and return the corresponding RDataType. The input value data type must be an instance of the Typepable typeclass from Data.Typeable

Metadata Functions

createRTableMData Source #

Arguments

:: (RTableName, [(ColumnName, ColumnDType)])
-> [ColumnName]	Primary Key. [] if no PK exists
-> [[ColumnName]]	list of unique keys. [] if no unique keys exists
-> RTableMData

createRTableMData : creates RTableMData from input given in the form of a list We assume that the column order of the input list defines the fixed column order of the RTuple.

getColumnNamesFromRTab :: RTable -> [ColumnName] Source #

Get the Column Names of an RTable

getColumnNamesFromRTuple :: RTuple -> [ColumnName] Source #

Returns the Column Names of an RTuple

getColumnInfoFromRTab :: RTable -> [ColumnInfo] Source #

getColumnInfoFromRTuple :: RTuple -> [ColumnInfo] Source #

getTheType :: RDataType -> ColumnDType Source #

Take a column value and return its type

listOfColInfoRDataType :: [ColumnInfo] -> RTuple -> [(ColumnInfo, RDataType)] Source #

Define equality for two ColumnInfo structures For two column two have "equal structure" they must have the same name and the same type. If one of the two (or both) have an UknownType, then they are still considered of equal structure.

Creates a list of the form [(ColumnInfo, RDataType)] from a list of ColumnInfo and an RTuple. The returned list respects the order of the [ColumnInfo]. It guarantees that RDataTypes will be in the same column order as [ColumnInfo], i.e., the correct RDataType for the correct column

toListColumnName :: RTupleMData -> [ColumnName] Source #

toListColumnName: returns a list of RTuple column names, in the fixed column order of the RTuple.

toListColumnInfo :: RTupleMData -> [ColumnInfo] Source #

toListColumnInfo: returns a list of RTuple columnInfo, in the fixed column order of the RTuple

rtabsSameStructure :: RTable -> RTable -> Bool Source #

Compares the structure of the input RTables and returns True if these are the same. By "structure", we mean that the ColumnNames and the corresponding data types must match. Essentially what we record in the ColumnInfo must be the same for the two RTables. Note that in the case of two columns having the same name but one of the two (or both) have a dtype equal to UknownType, then this function assumes that they are the same (i.e., equal ColumnInfos).

rtuplesSameStructure :: RTuple -> RTuple -> Bool Source #

Compares the structure of the input RTuples and returns True if these are the same. By "structure", we mean that the ColumnNames and the corresponding data types must match. Essentially what we record in the ColumnInfo must be the same for the two RTuples

getUniqueColumnNamesAfterJoin :: [ColumnName] -> [ColumnName] -> [ColumnName] Source #

Receives two lists of ColumnNames and returns the unique list of ColumnNames after concatenating the two and removing the names from the second one that are a prefix of the first one. This function is intended to dedublicate common columns after a join (see ij), where ColA for example, will also appear as ColA_1. This function DOES NOT dedublicate columns ColA and ColAsomeSuffix, only cases like this one ColName_Num (e.g., ColName_1, ColName_2, etc.) Here is an example:

>>> getUniqueColumnNames ["ColA","ColB"] ["ColC","ColA", "ColA_1", "ColA_2", "ColA_A", "ColA_hello", "ColAhello"]
>>> ["ColA","ColB","ColC","ColA_A","ColA_hello","ColAhello"]

Exceptions

data ColumnDoesNotExist Source #

This exception is thrown whenever we try to access a specific column (i.e., ColumnName) of an RTuple and the column does not exist.

Constructors

ColumnDoesNotExist ColumnName

Instances

Eq ColumnDoesNotExist Source #
Instance details Defined in RTable.Core Methods (==) :: ColumnDoesNotExist -> ColumnDoesNotExist -> Bool # (/=) :: ColumnDoesNotExist -> ColumnDoesNotExist -> Bool #
Show ColumnDoesNotExist Source #
Instance details Defined in RTable.Core Methods showsPrec :: Int -> ColumnDoesNotExist -> ShowS # show :: ColumnDoesNotExist -> String # showList :: [ColumnDoesNotExist] -> ShowS #
Exception ColumnDoesNotExist Source #
Instance details Defined in RTable.Core Methods toException :: ColumnDoesNotExist -> SomeException # fromException :: SomeException -> Maybe ColumnDoesNotExist # displayException :: ColumnDoesNotExist -> String #

data ConflictingRTableStructures Source #

This exception means that we have tried to do some operation between two RTables, which requires that the structure of the two is the same. e.g., an Insert Into TAB RTuples, or a UNION or toher set operations. By "structure", we mean that the ColumnNames and the corresponding data types must match. Essentially what we record in the ColumnInfo must be the same for the two RTables

Constructors

ConflictingRTableStructures String

Error message indicating the operation that failed.

Instances

Eq ConflictingRTableStructures Source #
Instance details Defined in RTable.Core Methods (==) :: ConflictingRTableStructures -> ConflictingRTableStructures -> Bool # (/=) :: ConflictingRTableStructures -> ConflictingRTableStructures -> Bool #
Show ConflictingRTableStructures Source #
Instance details Defined in RTable.Core Methods showsPrec :: Int -> ConflictingRTableStructures -> ShowS # show :: ConflictingRTableStructures -> String # showList :: [ConflictingRTableStructures] -> ShowS #
Exception ConflictingRTableStructures Source #
Instance details Defined in RTable.Core Methods toException :: ConflictingRTableStructures -> SomeException # fromException :: SomeException -> Maybe ConflictingRTableStructures # displayException :: ConflictingRTableStructures -> String #

data EmptyInputStringsInToRTimestamp Source #

Length mismatch between the format String and the input String data RTimestampFormatLengthMismatch = RTimestampFormatLengthMismatch String String deriving(Eq,Show) instance Exception RTimestampFormatLengthMismatch

One (or both) of the input Strings to function toRTimestamp are empty

Constructors

EmptyInputStringsInToRTimestamp String String

Instances

Eq EmptyInputStringsInToRTimestamp Source #
Instance details Defined in RTable.Core Methods (==) :: EmptyInputStringsInToRTimestamp -> EmptyInputStringsInToRTimestamp -> Bool # (/=) :: EmptyInputStringsInToRTimestamp -> EmptyInputStringsInToRTimestamp -> Bool #
Show EmptyInputStringsInToRTimestamp Source #
Instance details Defined in RTable.Core Methods showsPrec :: Int -> EmptyInputStringsInToRTimestamp -> ShowS # show :: EmptyInputStringsInToRTimestamp -> String # showList :: [EmptyInputStringsInToRTimestamp] -> ShowS #
Exception EmptyInputStringsInToRTimestamp Source #
Instance details Defined in RTable.Core Methods toException :: EmptyInputStringsInToRTimestamp -> SomeException # fromException :: SomeException -> Maybe EmptyInputStringsInToRTimestamp # displayException :: EmptyInputStringsInToRTimestamp -> String #

data UnsupportedTimeStampFormat Source #

This exception is thrown whenever we provide a Timestamp format with not even one valid format pattern

Constructors

UnsupportedTimeStampFormat String

Instances

Eq UnsupportedTimeStampFormat Source #
Instance details Defined in RTable.Core Methods (==) :: UnsupportedTimeStampFormat -> UnsupportedTimeStampFormat -> Bool # (/=) :: UnsupportedTimeStampFormat -> UnsupportedTimeStampFormat -> Bool #
Show UnsupportedTimeStampFormat Source #
Instance details Defined in RTable.Core Methods showsPrec :: Int -> UnsupportedTimeStampFormat -> ShowS # show :: UnsupportedTimeStampFormat -> String # showList :: [UnsupportedTimeStampFormat] -> ShowS #
Exception UnsupportedTimeStampFormat Source #
Instance details Defined in RTable.Core Methods toException :: UnsupportedTimeStampFormat -> SomeException # fromException :: SomeException -> Maybe UnsupportedTimeStampFormat # displayException :: UnsupportedTimeStampFormat -> String #

data UniquenessViolationInUpsert Source #

This exception means that we have tried an Upsert operation where the source RTable does not have a unique set of Rtuples if grouped by the columns used in the matching condition. This simply means that we cannot determine which of the dublicate RTuples in the source RTable will overwrite the target RTable, when the matching condition is satisfied.

Constructors

UniquenessViolationInUpsert String

Error message

Instances

Eq UniquenessViolationInUpsert Source #
Instance details Defined in RTable.Core Methods (==) :: UniquenessViolationInUpsert -> UniquenessViolationInUpsert -> Bool # (/=) :: UniquenessViolationInUpsert -> UniquenessViolationInUpsert -> Bool #
Show UniquenessViolationInUpsert Source #
Instance details Defined in RTable.Core Methods showsPrec :: Int -> UniquenessViolationInUpsert -> ShowS # show :: UniquenessViolationInUpsert -> String # showList :: [UniquenessViolationInUpsert] -> ShowS #
Exception UniquenessViolationInUpsert Source #
Instance details Defined in RTable.Core Methods toException :: UniquenessViolationInUpsert -> SomeException # fromException :: SomeException -> Maybe UniquenessViolationInUpsert # displayException :: UniquenessViolationInUpsert -> String #

RTable IO Operations

RTable Printing and Formatting

An Example of RTable printing

>>> -- define a simple RTable from a list
>>> :set -XOverloadedStrings
>>> :{
let tab1 =  rtableFromList [   rtupleFromList [("ColInteger", RInt 1), ("ColDouble", RDouble 2.3), ("ColText", RText "We dig dig dig dig dig dig dig")]
                        ,rtupleFromList [("ColInteger", RInt 2), ("ColDouble", RDouble 5.36879), ("ColText", RText "From early morn to night")]
                        ,rtupleFromList [("ColInteger", RInt 3), ("ColDouble", RDouble 999.9999), ("ColText", RText "In a mine the whole day through")]
                        ,rtupleFromList [("ColInteger", RInt 4), ("ColDouble", RDouble 0.9999), ("ColText", RText "Is what we like to do")]
                      ]
:}                          
>>> -- print without format specification
>>> printRTable tab1

   -----------------------------------------------------------------
   ColInteger     ColText                             ColDouble
   ~~~~~~~~~~     ~~~~~~~                             ~~~~~~~~~
   1              We dig dig dig dig dig dig dig      2.30
   2              From early morn to night            5.37
   3              In a mine the whole day through     1000.00
   4              Is what we like to do               1.00

   4 rows returned
   -----------------------------------------------------------------

>>> -- print with format specification (define column printing order and value formatting per column)
>>> printfRTable (genRTupleFormat ["ColInteger","ColDouble","ColText"] $ genColFormatMap [("ColInteger", Format "%d"),("ColDouble", Format "%1.1e"),("ColText", Format "%50s\n")]) tab1

   -----------------------------------------------------------------
   ColInteger     ColDouble     ColText
   ~~~~~~~~~~     ~~~~~~~~~     ~~~~~~~
   1              2.3e0                             We dig dig dig dig dig dig dig

   2              5.4e0                                   From early morn to night

   3              1.0e3                            In a mine the whole day through

   4              1.0e0                                      Is what we like to do


   4 rows returned
   -----------------------------------------------------------------

printRTable :: RTable -> IO () Source #

printRTable : Print the input RTable on screen

eitherPrintRTable :: Exception e => (RTable -> IO ()) -> RTable -> IO (Either e ()) Source #

Safe printRTable alternative that returns an Either, so as to give the ability to handle exceptions gracefully, during the evaluation of the input RTable. Example:

do 
 p <- (eitherPrintRTable  printRTable myRTab) :: IO (Either SomeException ())
 case p of
           Left exc -> putStrLn $ "There was an error in the Julius evaluation: " ++ (show exc)
           Right _  -> return ()

printfRTable :: RTupleFormat -> RTable -> IO () Source #

prints an RTable with an RTuple format specification. It can be used instead of printRTable when one of the following two is required:

a) When we want to specify the order that the columns will be printed on screen
b) When we want to specify the formatting of the values by using a printf-like FormatSpecifier

eitherPrintfRTable :: Exception e => (RTupleFormat -> RTable -> IO ()) -> RTupleFormat -> RTable -> IO (Either e ()) Source #

Safe printRfTable alternative that returns an Either, so as to give the ability to handle exceptions gracefully, during the evaluation of the input RTable. Example:

do 
 p <- (eitherPrintfRTable printfRTable myFormat myRTab) :: IO (Either SomeException ())
 case p of
           Left exc -> putStrLn $ "There was an error in the Julius evaluation: " ++ (show exc)
           Right _  -> return ()

data RTupleFormat Source #

Basic data type for defining the desired formatting of an RTuple when printing an RTable (see printfRTable).

Constructors

RTupleFormat
Fields colSelectList :: [ColumnName] For defining the column ordering (i.e., the SELECT clause in SQL) colFormatMap :: ColFormatMap For defining the formating per Column in "`printf` style"

Instances

Eq RTupleFormat Source #
Instance details Defined in RTable.Core Methods (==) :: RTupleFormat -> RTupleFormat -> Bool # (/=) :: RTupleFormat -> RTupleFormat -> Bool #
Show RTupleFormat Source #
Instance details Defined in RTable.Core Methods showsPrec :: Int -> RTupleFormat -> ShowS # show :: RTupleFormat -> String # showList :: [RTupleFormat] -> ShowS #

type ColFormatMap = HashMap ColumnName FormatSpecifier Source #

A map of ColumnName to Format Specification

data FormatSpecifier Source #

Format specifier of printf style

Constructors

DefaultFormat
Format String

Instances

Eq FormatSpecifier Source #
Instance details Defined in RTable.Core Methods (==) :: FormatSpecifier -> FormatSpecifier -> Bool # (/=) :: FormatSpecifier -> FormatSpecifier -> Bool #
Show FormatSpecifier Source #
Instance details Defined in RTable.Core Methods showsPrec :: Int -> FormatSpecifier -> ShowS # show :: FormatSpecifier -> String # showList :: [FormatSpecifier] -> ShowS #

data OrderingSpec Source #

A sum type to help the specification of a column ordering (Ascending, or Descending)

Constructors

Asc
Desc

Instances

Eq OrderingSpec Source #
Instance details Defined in RTable.Core Methods (==) :: OrderingSpec -> OrderingSpec -> Bool # (/=) :: OrderingSpec -> OrderingSpec -> Bool #
Show OrderingSpec Source #
Instance details Defined in RTable.Core Methods showsPrec :: Int -> OrderingSpec -> ShowS # show :: OrderingSpec -> String # showList :: [OrderingSpec] -> ShowS #

genRTupleFormat Source #

Arguments

:: [ColumnName]	Column Select list
-> ColFormatMap	Column Format Map
-> RTupleFormat	Output

Generate an RTupleFormat data type instance

genRTupleFormatDefault :: RTupleFormat Source #

Generate a default RTupleFormat data type instance. In this case the returned column order (Select list), will be unspecified and dependant only by the underlying structure of the RTuple (HashMap)

genColFormatMap :: [(ColumnName, FormatSpecifier)] -> ColFormatMap Source #

Generates a Column Format Specification

genDefaultColFormatMap :: ColFormatMap Source #

Generates a default Column Format Specification