!      !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~      !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~Safe1Implements the relational Table concept. Defines all necessary data types like RTable and RTuple as well as basic relational algebra operations on RTables.(c) Nikos Karagiannidis, 2018 BSD3 nkarag@gmail.com stable POSIX None"#16KU DBFunctor#Length mismatch between the format  and the input  data RTimestampFormatLengthMismatch = RTimestampFormatLengthMismatch String String deriving(Eq,Show) instance Exception RTimestampFormatLengthMismatchOne (or both) of the input s to function  are empty DBFunctorgThis exception is thrown whenever we provide a Timestamp format with not even one valid format pattern DBFunctorLThis exception is thrown whenever we try to access a specific column (i.e., b) of an d! and the column does not exist.  DBFunctorFormat specifier of  style  DBFunctor+A map of ColumnName to Format Specification  DBFunctor:Basic data type for defining the desired formatting of an d when printing an RTable (see ). DBFunctorEFor defining the column ordering (i.e., the SELECT clause in SQL)  DBFunctor*For defining the formating per Column in " style" DBFunctorRTabResult is the result of an RTable operation and is a Writer Monad, that includes the new RTable, as well as the number of RTuples returned by the operation. DBFunctor1Number of RTuples returned by an RTable operation DBFunctor_Aggregation Function type. An aggregation function receives as input a source column (i.e., a b) of a source em and returns an aggregated value, which is the result of the aggregation on the values of the source column. DBFunctorRThis data type represents all possible aggregate operations over an RTable. Examples are : Sum, Count, Average, Min, Max but it can be any other "aggregation". 
The essential property of an aggregate operation is that it acts on an RTable (or on a group of RTuples, in the case of the RGroupBy operation) and produces a single RTuple.
An aggregate operation is applied on a specific column (the source column) and the aggregated result is stored in the target column. It is important to understand that the produced aggregated RTuple is different from the input RTuples. It is a totally new RTuple that consists of the aggregated column(s) (and the grouping columns in the case of an RGroupBy).
Source column.
Target column.
Here we define the aggregate function to be applied on an RTable.
The Group By Predicate. It defines the condition for two RTuples to be included in the same group.
The Join Predicate. It defines when two RTuples should be paired.
A generic binary operation on an RTable.
A generic unary operation on an RTable.
A sum type to help the specification of a column ordering (ascending or descending).
Definition of Relational Algebra operations. These are the valid operations between RTables.
Union.
Intersection.
Difference.
Projection.
Filter operation (a predicate can be any function of the signature RTuple -> Bool, so it is much more powerful than a typical SQL filter expression, which is a boolean expression of comparison operators).
Inner Join (any type of join predicate is allowed. Any function with a signature of the form RTuple -> RTuple -> Bool is a valid join predicate, i.e., a function which returns True when two RTuples must be paired).
Left Outer Join.
Right Outer Join.
Performs aggregation operations on specific columns and returns a singleton RTable.
A Group By operation. The SQL equivalent is: SELECT colGrByList, aggList FROM...
GROUP BY colGrByList \ Note that compared to SQL, we can have a more generic grouping predicate (i.e., when two dhs should belong in the same group) than just the equality of values on the common columns between two ds. Also note, that in the case of an aggregation without grouping (equivalent to a single-group group by), then the grouping predicate should be:  _ _ -> True + DBFunctorA combination of unary  s e.g., / (p plist).(f pred) (i.e., RPrj . RFilter)  , in the form of an  RTable -> RTable function. k In this sense we can also include a binary operation (e.g. join), if we partially apply the join to one e, e.g., '(ij jpred rtab) . (p plist) . (f pred) , DBFunctorA generic binary .- DBFunctor Order the d s of the e{ acocrding to the specified list of Columns. First column in the input list has the highest priority in the sorting order.1 DBFunctorlist of aggregates 2 DBFunctorthe grouping predicate3 DBFunctorthe Group By list of columns7 DBFunctorFA Predicate. It defines an arbitrary condition over the columns of an d%. It is used primarily in the filter %+ operation and used in the filter function .8 DBFunctor(Basic metadata for a column of an RTuple< DBFunctorBasic Metadata of an d . The d! metadata are accessed through a  b 8* structure. I.e., for each column of the d, we access the 8D structure to get Column-level metadata. This access is achieved by bZ. However, in order to provide the "impression" of a fixed column order per tuple (see d! definition), we provide another  , the   b8. So in the follwoing example, if we want to access the <h tupmdata ColumnInfo by column order, (assuming that we have N columns) we have to do the following:  (snd tupmdata)!((fst tupmdata)!0) (snd tupmdata)!((fst tupmdata)!1) ... (snd tupmdata)!((fst tupmdata)!(N-1)) 7In the same manner in order to access the column of an d2 (e.g., tup) by column order, we do the following: a tup!((fst tupmdata)!0) tup!((fst tupmdata)!1) ... 
tup!((fst tupmdata)!(N-1)) = DBFunctorMetadata for an RTable? DBFunctor Name of the e@ DBFunctor8Tuple-level metadata other metadataA DBFunctor Primary KeyB DBFunctorIList of unique keys i.e., each sublist is a unique key column combinationC DBFunctormBasic data type to represent time. This is a strict data type, meaning whenever we evaluate a value of type C<, there must be also evaluated all the fields it contains.N DBFunctor[Definition of the Relational Data Type. This is the data type of the values stored in each eL. This is a strict data type, meaning whenever we evaluate a value of type N<, there must be also evaluated all the fields it contains.X DBFunctore.g., "DD/MM/YYYY"[ DBFunctor-This is used only for metadata purposes (see 8). The actual data type of a value is an RDataType The Text component of Date and Timestamp data constructors is the date format e.g., "DD/MM/YYYY", "DD/MM/YYYY HH24:MI:SS"a DBFunctorDefinition of the Table Nameb DBFunctorDefinition of the Column Namec DBFunctorDefinition of the Name typed DBFunctor*Definition of the Relational Tuple. An d is implemented as a  of (b, Ni) pairs. This ensures fast access of the column value by column name. Note that this implies that the dR CANNOT have more than one columns with the same name (i.e. hashmap key) and more importantly that it DOES NOT have a fixed order of columns, as it is usual in RDBMS implementations. This gives us the freedom to perform column change operations very fast. The only place were we need fixed column order is when we try to load an e from a fixed-column structure such as a CSV file. For this reason, we have embedded the notion of a fixed column-order in the d metadata. See <.e DBFunctor0Definition of the Relational Table entity An e is a "container" of ds.f DBFunctor@Basic class to represent a data type that can be turned into an e/. 
It implements the concept of "tabular data".
Turns an RTable to a list of RTuples.
Creates an RTable from a list of RTuples.
Turns an RTuple to a list.
Creates an RTuple from a list.
Use this function to compare an RDataType with the Null value, because due to Null logic x == Null or x /= Null will always return False. It returns True if the input value is Null.
Use this function to compare an RDataType with the Null value, because due to Null logic x == Null or x /= Null will always return False. It returns True if the input value is not Null.
Standard date format.
Gets the column names of an RTable.
Gets the first RTuple from an RTable.
Returns the column names of an RTuple.
Returns the value of an RTuple column based on the ColumnName key. If the column name is not found, it returns Nothing.
Returns the value of an RTuple column based on the ColumnName key. If the column name is not found, it returns a default value.
getRTupColValue: Returns the value of an RTuple column based on the ColumnName key. If the column name is not found, it returns Null. Note that this might be confusing, since there might be an existing column name with a Null value.
Operator for getting a column value from an RTuple. Throws an exception if this map contains no mapping for the key.
Safe operator for getting a column value from an RTuple. If the column name is not found, it returns Nothing.
Returns the 1st parameter if this is not Null, otherwise it returns the 2nd.
Returns the value of a specific column (specified by name) if this is not Null. If this value is Null, then it returns the 2nd parameter. If you pass an empty RTuple, then it returns Null. Throws an exception if this map contains no mapping for the key.
It receives an RTuple and looks up the value at a specific column name.
Then it compares this value with the specified search value. If it is equal to the search value, it returns the specified return value. If not, it returns the specified default value if the ignore indicator is not set; otherwise (if the ignore indicator is set) it returns the existing value. If you pass an empty RTuple, then it returns Null. Throws an exception if this map contains no mapping for the key.
It receives an RTuple and a default value. It returns a new RTuple which is identical to the source one, but every Null value in the specified column has been replaced by the default value.
It receives an RTable and a default value. It returns a new RTable which is identical to the source one, but for the specified column every Null value in every RTuple has been replaced by the default value. If you pass an empty RTable, then it returns an empty RTable. Throws an exception if the column does not exist.
It receives an RTable, a search value and a default value. It returns a new RTable which is identical to the source one, but for each RTuple, for the specified column: if the search value was found, then the specified return value is returned; otherwise the default value is returned (if the ignore indicator is not set) or the existing value of the column is kept (if the ignore indicator is set). If you pass an empty RTable, then it returns an empty RTable. Throws an exception if the column does not exist.
Upsert (update/insert) an RTuple at a specific column, specified by name, with a value. If the cname key is not found, then the (columnName, value) pair is inserted. If it exists, then the value is updated with the input value.
stripRText: O(n) Removes leading and trailing white space from a string. If the input RDataType is not an RText, then Null is returned.
Concatenates two Text RDataTypes; in all other cases of RDataType it returns Null.
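The decode-style lookup described above (look up a column, compare it with a search value, then return the return value, the default value, or the existing value) can be sketched as follows. This is only an illustration under stated assumptions: the `Value` and `Tuple` types, the `decodeColumn` name, the use of a `Bool` for the ignore indicator and the association-list representation are all hypothetical stand-ins, not the library's actual definitions.

```haskell
-- Hypothetical, simplified stand-ins for the library's value and tuple types.
-- The derived Eq is used for brevity; it does NOT model SQL Null logic.
data Value = IntVal Integer | TextVal String | Null
  deriving (Eq, Show)

-- An association list stands in for the hash map used by the real RTuple.
type Tuple = [(String, Value)]

-- decodeColumn col search ret def ignoreDefault tup
--   * empty tuple                        -> Null
--   * column missing                     -> runtime error (stand-in for the exception)
--   * column value equals search value   -> ret
--   * otherwise, ignore indicator set    -> existing value
--   * otherwise                          -> def
decodeColumn :: String -> Value -> Value -> Value -> Bool -> Tuple -> Value
decodeColumn _ _ _ _ _ [] = Null
decodeColumn col search ret def ignoreDefault tup =
  case lookup col tup of
    Nothing -> error ("column does not exist: " ++ col)
    Just v
      | v == search   -> ret
      | ignoreDefault -> v
      | otherwise     -> def

main :: IO ()
main = do
  let tup = [("status", TextVal "A"), ("qty", IntVal 3)]
  -- search value found: the return value is produced
  print (decodeColumn "status" (TextVal "A") (TextVal "Active") (TextVal "Other") False tup)
  -- search value not found, indicator not set: the default value is produced
  print (decodeColumn "qty" (IntVal 0) (IntVal 1) (IntVal (-1)) False tup)
  -- search value not found, indicator set: the existing value is kept
  print (decodeColumn "qty" (IntVal 0) (IntVal 1) (IntVal (-1)) True tup)
  -- empty tuple: Null
  print (decodeColumn "qty" (IntVal 0) (IntVal 1) Null False [])
```

The argument order follows the parameter documentation of the tuple-level function (column name, search value, return value, default value, ignore indicator, input tuple); everything else is an assumption made for the sake of a runnable example.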
DBFunctor`Helper function to remove a character around (from both beginning and end) of an (RText t) value DBFunctor Returns an C from an input  and a format .Valid format patterns are: For year: YYYY, e.g., "0001", "2018" For month: MM, e.g., "01", "1", "12" For day: DD , e.g., "01", "1", "31" For hours: HH, HH24 e.g., "00", "23"+ I.e., hours must be specified in 24 format For minutes: MI, e.g., "01", "1", "59" For seconds: SS, e.g., "01", "1", "59"'Example of a typical format string is: "DD/MM/YYYY HH:MI:SS,If no valid format pattern is found then an  exception is thrown DBFunctor7Search for the first occurence of a substring within a , and return the 1st character position, or  if the substring is not found. DBFunctor7Search for the first occurence of a substring within a 3 string and return the 1st character position, or  if the substring is not found. DBFunctor7Search for the first occurence of a substring within a P3 string and return the 1st character position, or 2 if the substring is not found, or if an non-text N, is given as input. DBFunctormCreates an RTimestamp data type from an input timestamp format string and a timestamp value represented as a . Valid format patterns are: For year: YYYY, e.g., "0001", "2018" For month: MM, e.g., "01", "1", "12" For day: DD , e.g., "01", "1", "31" For hours: HH, HH24 e.g., "00", "23"+ I.e., hours must be specified in 24 format For minutes: MI, e.g., "01", "1", "59" For seconds: SS, e.g., "01", "1", "59"'Example of a typical format string is: "DD/MM/YYYY HH:MI:SS,If no valid format pattern is found then an  exception is thrown DBFunctor_Return the Text out of an RDataType If a non-text RDataType is given then Nothing is returned. DBFunctor Return an N from  DBFunctorReturns  only if this is an P DBFunctor+Standard timestamp format. For example: "DDMMYYYY HH24:MI:SS" DBFunctorQrTimeStampToText: converts an RTimestamp value to RText Valid input formats are:1.  "DD/MM/YYYY HH24:MI:SS" 2.  "YYYYMMDD-HH24.MI.SS" 3.  
"YYYYMMDD" 4.  "YYYYMM" 5.  "YYYY"  DBFunctorcreateRTableMData : creates RTableMData from input given in the form of a list We assume that the column order of the input list defines the fixed column order of the RTuple. DBFunctorcreateRTupleMdata : Creates an RTupleMData instance based on a list of (Column name, Column Data type) pairs. The order in the input list defines the fixed column order of the RTuple DBFunctoratoListColumnName: returns a list of RTuple column names, in the fixed column order of the RTuple. DBFunctor^toListColumnInfo: returns a list of RTuple columnInfo, in the fixed column order of the RTuple DBFunctoritoListRDataType: returns a list of RDataType values of an RTuple, in the fixed column order of the RTuple DBFunctor9Creates a list of the form [(ColumnInfo, RDataType)] from a list of ColumnInfo and an RTuple. The returned list respects the order of the [ColumnInfo] Prelude.zip listOfColInfo (Prelude.map (snd) $ HM.toList rtup) -- this code does NOT guarantee that HM.toList will return the same column order as [ColumnInfo] DBFunctorcreateRDataType: Get a value of type a and return the corresponding RDataType. The input value data type must be an instance of the Typepable typeclass from Data.Typeable DBFunctor Returns an 5 with a custom aggregation function provided as input DBFunctorThe Sum aggregate operation DBFunctorWA helper function in raggSum that implements the basic fold for sum aggregation  DBFunctorThe Count aggregate operation DBFunctor[A helper function in raggCount that implements the basic fold for Count aggregation  DBFunctorThe Average aggregate operation DBFunctorThe Max aggregate operation DBFunctorWA helper function in raggMax that implements the basic fold for Max aggregation  DBFunctorThe Min aggregate operation DBFunctorWA helper function in raggMin that implements the basic fold for Min aggregation  DBFunctor@ropU operator executes a unary ROperation. 
A short name for the  function DBFunctorExecute a Unary ROperation DBFunctorCropUres operator executes a unary ROperation. A short name for the  function DBFunctor)Execute a Unary ROperation and return an  DBFunctorAropB operator executes a binary ROperation. A short name for the  function DBFunctorExecute a Binary ROperation DBFunctorDropBres operator executes a binary ROperation. A short name for the  function DBFunctor*Execute a Binary ROperation and return an  DBFunctorTest whether an RTable is empty DBFunctorTest whether an RTuple is empty DBFunctor#emptyRTable: Create an empty RTable DBFunctorACreates an empty RTuple (i.e., one with no column,value mappings) DBFunctor&Creates an RTable with a single RTuple DBFunctorEcreateRTuple: Create an Rtuple from a list of column names and values DBFunctorCreates a Null d+ based on a list of input Column Names. A T d is an d( where all column names correspond to a T value (T is a data constructor of N) DBFunctorMReturns True if the input RTuple is a Null RTuple, otherwise it returns False DBFunctorThis is a fold operation on a e that returns an e. It is similar with : 3 foldr' :: (a -> b -> b) -> b -> Vector a -> b B of Vector, which is an O(n) Right fold with a strict accumulator DBFunctorThis is a fold operation on a e that returns an N value. It is similar with : 3 foldr' :: (a -> b -> b) -> b -> Vector a -> b B of Vector, which is an O(n) Right fold with a strict accumulator DBFunctorThis is a fold operation on e that returns an e. 
It is similar to: foldl' :: (a -> b -> a) -> a -> Vector b -> a of Vector, which is an O(n) left fold with a strict accumulator.
This is a fold operation on an RTable that returns an RDataType value. It is similar to: foldl' :: (a -> b -> a) -> a -> Vector b -> a of Vector, which is an O(n) left fold with a strict accumulator.
Map function over an RTable.
Creates an RTuplesRet type.
Returns the number embedded in the RTuplesRet data type.
Creates an RTabResult (i.e., a Writer Monad) from a result RTable and the number of RTuples that it returned.
Returns the info "stored" in the RTabResult Writer Monad.
Returns the "log message" in the RTabResult Writer Monad, which is the number of returned RTuples.
removeColumn: removes a column from an RTable. The column is specified by ColumnName. If this ColumnName does not exist in the RTuple of the input RTable, then nothing happens and the RTuple remains intact.
addColumn: adds a column to an RTable.
Filter (i.e., selection operator). A short name for the runRFilter function.
Executes an RFilter operation.
RTable Projection operator. A short name for the  function.
Implements the RTable projection operation. If a column name does not exist, then an empty RTable is returned.
Implements the RTable projection operation. If a column name does not exist, then the returned RTable includes this column with a Null value. This projection implementation allows missed hits.
Returns the N first RTuples of an RTable.
RTable Inner Join Operator. A short name for the  function.
Implements an Inner Join operation between two RTables (any type of join predicate is allowed). Note that this operation is implemented as a map union, which means that the first Map (i.e., the left RTuple) will be preferred when duplicate keys are encountered with different values.
That is, in the context of joining two RTuples, the value of the first (i.e., left) RTuple on the common key will be preferred.
Implements an Inner Join operation between two RTables (any type of join predicate is allowed). This Inner Join implementation follows Oracle DB's convention for common column names. When we have two tuples t1 and t2 with a common column name (let's say "Common"), then the resulting tuple after a join will have "Common" and "Common_1", i.e., a "_1" suffix is appended. The tuple from the left table by convention retains the original column name, so "Common_1" is the column from the right table. If "Common_1" already exists, then "Common_2" is used.
Joins two RTuples into one. In this join we follow Oracle DB's convention when joining two tuples with some common column names. When we have two tuples t1 and t2 with a common column name (let's say Common), then the resulting tuple after a join will have Common and Common_1, i.e., a "_1" suffix is appended. The tuple from the left table by convention retains the original column name, so Common_1 is the column from the right table. If Common_1 already exists, then Common_2 is used.
RTable Left Outer Join Operator. A short name for the  function.
Implements a Left Outer Join operation between two RTables (any type of join predicate is allowed), i.e., the rows of the left RTable will be preserved. Note that since the underlying structure for an RTuple is a Data.HashMap.Strict, only one value per key is allowed. So, when duplicate keys are encountered in the context of joining two RTuples, the value of the left RTuple on the common key will be preferred.
Implements a Left Outer Join operation between two RTables (any type of join predicate is allowed), i.e., the rows of the left RTable will be preserved.
A Left Join: tabLeft LEFT JOIN tabRight ON joinPred, where tabLeft is the preserving table, can be defined as the union of the following two RTables: a) the result of the inner join tabLeft INNER JOIN tabRight ON joinPred; b) the rows from the preserving table (tabLeft) that do not satisfy the join condition, enhanced with the columns of tabRight returning Null values. The common columns will appear from both tables, but only the left table's columns will retain their original names.
RTable Right Outer Join Operator. A short name for the  function.
Implements a Right Outer Join operation between two RTables (any type of join predicate is allowed), i.e., the rows of the right RTable will be preserved. A Right Join: tabLeft RIGHT JOIN tabRight ON joinPred, where tabRight is the preserving table, can be defined as the union of the following two RTables: a) the result of the inner join tabLeft INNER JOIN tabRight ON joinPred; b) the rows from the preserving table (tabRight) that do not satisfy the join condition, enhanced with the columns of tabLeft returning Null values. The common columns will appear from both tables, but only the right table's columns will retain their original names.
Implements a Right Outer Join operation between two RTables (any type of join predicate is allowed), i.e., the rows of the right RTable will be preserved. Note that since the underlying structure for an RTuple is a Data.HashMap.Strict, only one value per key is allowed. So, when duplicate keys are encountered in the context of joining two RTuples, the value of the right RTuple on the common key will be preferred.
RTable Full Outer Join Operator. A short name for the  function.
Implements a Full Outer Join operation between two RTables (any type of join predicate is allowed). A full outer join is the union of the left and right outer joins respectively.
The common columns will appear from both tables, but only the left table's columns will retain their original names (just by convention).
RTable Union Operator. A short name for the  function.
Implements the union of two RTables as a union of two lists (see  ). Duplicates, and elements of the first list, are removed from the second list, but if the first list contains duplicates, so will the result.
Implements the union of two RTables as a union of two lists (see  ).
RTable Intersection Operator. A short name for the  function.
Implements the intersection of two RTables.
RTable Difference Operator. A short name for the  function.
Implements the set difference of two RTables as the diff of two lists (see  ).
Aggregation Operator. A short name for the  function.
Implements the aggregation operation on an RTable. It aggregates the specific columns in each AggOperation and returns a singleton RTable, i.e., an RTable with a single RTuple that includes only the aggregated columns and their aggregated values.
Order By Operator. A short name for the  function.
Implements the ORDER BY operation. The first column in the input list has the highest priority in the sorting order. We treat Null as the maximum value (anything compared to Null is smaller). This way Nulls are sent to the end (i.e., "Nulls Last" in SQL parlance). This is for Asc ordering. For Desc ordering, we have the opposite: Nulls go first and so anything compared to Null is greater.
SQL example:
    with q as (select case when level < 4 then level else NULL end c1 -- , level c2
               from dual connect by level < 7)
    select * from q order by c1
Result:
    C1
    ----
    1
    2
    3
    Null
    Null
    Null
SQL example (descending):
    with q as (select case when level < 4 then level else NULL end c1 -- , level c2
               from dual connect by level < 7)
    select * from q order by c1 desc
Group By Operator. A short name for the  function.
Implements a grouping operation over an RTable.
No aggregation takes place. It returns the individual groups as separate e)s in a list. In total the initial set of ds is retained. If an empty e; is provided as input, then a ["empty RTable"] is returned. DBFunctorConcatenates a list of e2s to a single RTable. Essentially, it unions (see ) all es of the list. DBFunctor'Implement a grouping operation over an e*. No aggregation takes place. The output e has exactly the same d[s, as the input, but these are grouped based on the input grouping predicate. If an empty e% is provided as input, then an empty e is returned. DBFunctor*Implements the GROUP BY operation over an e.  DBFunctor;Helper function to returned a fixed Ordering Specification  from a list of bs DBFunctorA short name for the  function DBFunctorrunCombinedROp: A Higher Order function that accepts as input a combination of unary ROperations e.g., (p plist).(f pred) expressed in the form of a function (RTable -> Rtable) and applies this function to the input RTable. In this sense we can also include a binary operation (e.g. join), if we partially apply the join to one RTable e.g., (ij jpred rtab) . (p plist) . (f pred) DBFunctor"O(n) append an RTuple to an RTable DBFunctor#O(n) prepend an RTuple to an RTable DBFunctorUpdate an RTable. Input includes a list of (ColumnName, new Value) pairs. Also a filter predicate is specified in order to restrict the update only to those rtuples that fulfill the predicate DBFunctor/Generates a default Column Format Specification DBFunctor'Generates a Column Format Specification DBFunctor+Generate an RTupleFormat data type instance DBFunctorGenerate a default RTupleFormat data type instance. In this case the returned column order (Select list), will be unspecified and dependant only by the underlying structure of the d () DBFunctorSafe  printRfTable alternative that returns an r, so as to give the ability to handle exceptions gracefully, during the evaluation of the input RTable. 
Example: do p <- (eitherPrintfRTable printfRTable myFormat myRTab) :: IO (Either SomeException ()) case p of Left exc -> putStrLn $ "There was an error in the Julius evaluation: " ++ (show exc) Right _ -> return ()  DBFunctorQprints an RTable with an RTuple format specification. It can be used instead of + when one of the following two is required:Oa) When we want to specify the order that the columns will be printed on screenCb) When we want to specify the formatting of the values by using a -like  DBFunctorSafe  alternative that returns an r, so as to give the ability to handle exceptions gracefully, during the evaluation of the input RTable. Example: do p <- (eitherPrintRTable printRTable myRTab) :: IO (Either SomeException ()) case p of Left exc -> putStrLn $ "There was an error in the Julius evaluation: " ++ (show exc) Right _ -> return ()  DBFunctor.printRTable : Print the input RTable on screen DBFunctorReturns the max length of the String representation of each value, for each column of the input RTable. It returns the lengths in the column order specified by the input RTupleFormat parameter DBFunctorhReturns the max length of the String representation of each value, for each column of the input RTable.  DBFunctoruhelper function in order to format the value of a column It will append at the end of the string n number of spaces. DBFunctoruhelper function in order to format the value of a column It will append at the end of the string n number of spaces. DBFunctorhelper function that prints a continuous line adjusted to the size of the input RTable The column order is specified by the input RTupleFormat parameter DBFunctorVhelper function that prints a continuous line adjusted to the size of the input RTable DBFunctorPrints the input RTable's header (i.e., column names) on screen. The column order is specified by the corresponding RTupleFormat parameter. 
printRTableHeader: Prints the input RTable's header (i.e., column names) on screen.
Prints an RTuple on screen (only the values of the columns). [Int] is a list of widths per column to be used in the box definition. The column order as well as the formatting specifications are given by the first parameter. We assume that the order in [Int] corresponds to that of the RTupleFormat parameter.
Prints an RTuple on screen (only the values of the columns). [Int] is a list of widths per column to be used in the box definition.
Turns the value stored in an RDataType into a String, in order to be able to print it with respect to the specified format.
Turns the value stored in an RDataType into a String, in order to be able to print it. Values are transformed with a default formatting.
In order to be able to force full evaluation up to Normal Form (NF).
In order to be able to use (/) with RDataType.
We need to explicitly specify equality of RDataType due to SQL NULL logic (i.e., anything compared to NULL returns false): Null == _ = False, _ == Null = False, Null /= _ = False, _ /= Null = False. IMPORTANT NOTE: this means that anywhere in your code where you have something like x == Null or x /= Null, the comparison will always return False, and thus it is futile. You have to use the dedicated Null-test function instead.
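The SQL-style Null equality described above (every comparison that involves Null is False, for both == and /=) can be illustrated with a small self-contained sketch. The `Value` type and the `isNullValue` helper below are hypothetical stand-ins chosen for this example; they are not the library's RDataType or its actual Null-test function.

```haskell
-- Hypothetical, simplified stand-in for the library's RDataType.
data Value
  = IntVal Integer
  | TextVal String
  | Null
  deriving (Show)

-- Hand-written Eq instance: any comparison that involves Null is False,
-- for both (==) and (/=), mirroring the equations quoted in the text.
instance Eq Value where
  Null      == _         = False
  _         == Null      = False
  IntVal a  == IntVal b  = a == b
  TextVal a == TextVal b = a == b
  _         == _         = False

  Null /= _    = False
  _    /= Null = False
  a    /= b    = not (a == b)

-- Dedicated Null test (hypothetical name), implemented by pattern matching
-- instead of (==), so it is not affected by the Null equality rules.
isNullValue :: Value -> Bool
isNullValue Null = True
isNullValue _    = False

main :: IO ()
main = do
  print (Null == Null)          -- False
  print (Null /= IntVal 1)      -- False
  print (IntVal 1 == IntVal 1)  -- True
  print (isNullValue Null)      -- True
```

Because both `x == Null` and `x /= Null` evaluate to False under such an instance, a filter predicate that needs to detect missing values has to go through a pattern-matching test like the one above.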
DBFunctorEIn order to be able to force full evaluation up to Normal Form (NF) <https://www.fpcomplete.com/blog/2017/09/all-about-strictness4s DBFunctorColumnName key DBFunctor Input RTuple DBFunctor Output valuet DBFunctorPDefault value to return in the case the column name does not exist in the RTuple DBFunctorColumnName key DBFunctor Input RTuple DBFunctor Output valueu DBFunctorColumnName key DBFunctor Input RTuple DBFunctor Output valuev DBFunctor Input RTuple DBFunctorColumnName key DBFunctor Output valuew DBFunctor Input RTuple DBFunctorColumnName key DBFunctor Output valuex DBFunctor input value DBFunctor-default value returned if input value is Null DBFunctor output valuey DBFunctorColumnName key DBFunctor(value returned if original value is Null DBFunctor input RTuple DBFunctor output valuez DBFunctorColumnName key DBFunctor Search value DBFunctor Return value DBFunctorDefault value  DBFunctorIgnore default indicator  DBFunctor input RTuple{ DBFunctorColumnName key DBFunctor/Default value in the case of Null column values DBFunctorinput RTuple  DBFunctor output RTuple| DBFunctorColumnName key DBFunctorDefault value  DBFunctor input RTable} DBFunctorColumnName key DBFunctor Search value DBFunctor Return value DBFunctorDefault value  DBFunctorIgnore default indicator  DBFunctor input RTable~ DBFunctor#key where the upset will take place DBFunctor new value DBFunctor input RTuple DBFunctor output RTuple DBFunctor input string DBFunctor)Format string e.g., "DD/MM/YYYY HH:MI:SS" DBFunctorTimestamp string DBFunctorsubstring to search for DBFunctorstring to be searched DBFunctor5Position within input string of substr 1st character  DBFunctorsubstring to search for DBFunctorstring to be searched DBFunctor5Position within input string of substr 1st character  DBFunctorsubstring to search for DBFunctorstring to be searched DBFunctor5Position within input string of substr 1st character  DBFunctor+Format string e.g., "DD/MM/YYYY HH24:MI:SS" DBFunctorTimestamp string 
DBFunctor+Output format e.g., "DD/MM/YYYY HH24:MI:SS" DBFunctorInput RTimestamp  DBFunctor Output RText DBFunctorPrimary Key. [] if no PK exists DBFunctor0list of unique keys. [] if no unique keys exists DBFunctor input value DBFunctoroutput RDataType DBFunctorcustom aggregation function  DBFunctor source column DBFunctor target column DBFunctor source column DBFunctor target column DBFunctor source column DBFunctor target column DBFunctor source column DBFunctor target column DBFunctor source column DBFunctor target column DBFunctor source column DBFunctor target column DBFunctorinput ROperation DBFunctor input RTable DBFunctor output RTable DBFunctorinput ROperation DBFunctor input RTable DBFunctoroutput: Result of operation DBFunctorinput ROperation DBFunctor input RTable1 DBFunctorinput RTable2  DBFunctor output RTabl DBFunctorinput ROperation DBFunctor input RTable1 DBFunctorinput RTable2  DBFunctoroutput: Result of operation DBFunctor&input list of (columnname,value) pairs DBFunctor input pair  DBFunctoroutput Writer Monad DBFunctorColumn to be removed DBFunctor input RTable  DBFunctoroutput RTable  DBFunctorname of the column to be added DBFunctorZDefault value of the new column. 
All RTuples will initially have this value in this column DBFunctor Input RTable DBFunctor Output RTable DBFunctor>list of column names to be included in the final result RTable DBFunctor>list of column names to be included in the final result RTable DBFunctor number of N d s to return DBFunctorinput e DBFunctoroutput e DBFunctorInput Aggregate Operations DBFunctor Input RTable DBFunctorOutput singleton RTable DBFunctorInput ordering specification DBFunctor Input RTable DBFunctor Output RTable DBFunctor3Grouping predicate, in order to form the groups of ds (it defines when two d's should be included in the same group) DBFunctorList of grouping column names (GROUP BY clause in SQL) We assume that all RTuples in the same group have the same value in these columns DBFunctorinput e DBFunctoroutput list of e's where each one corresponds to a group DBFunctor3Grouping predicate, in order to form the groups of ds (it defines when two d's should be included in the same group) DBFunctorMList of grouping column names (GROUP BY clause in SQL) We assume that all d8s in the same group have the same value in these columns DBFunctorinput e DBFunctoroutput e DBFunctor}Grouping predicate, in order to form the groups of RTuples (it defines when two RTuples should be included in the same group) DBFunctor.Aggregations to be applied on specific columns DBFunctorList of grouping column names (GROUP BY clause in SQL) We assume that all RTuples in the same group have the same value in these columns DBFunctor input RTable DBFunctor output RTable DBFunctorinput combined RTable operation DBFunctor7input RTable that the input function will be applied to DBFunctor output RTable DBFunctorDList of column names to be updated with the corresponding new values DBFunctorCAn RTuple -> Bool function that specifies the RTuples to be updated DBFunctor Input RTable DBFunctor Output RTable DBFunctorColumn Select list  DBFunctorColumn Format Map DBFunctorOutput DBFunctornumber of spaces to add DBFunctor 
input String DBFunctor output string DBFunctornumber of spaces to add DBFunctorcharacter to add DBFunctor input String DBFunctor output string DBFunctor&Specifies the appropriate column order DBFunctor;a List of width per column to be used in the box definition DBFunctor*the char with which the line will be drawn DBFunctor;a List of width per column to be used in the box definition DBFunctor*the char with which the line will be drawn DBFunctorSpecifies Column order DBFunctor;a List of width per column to be used in the box definition DBFunctor;a List of width per column to be used in the box definition  !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNTOPQRSUVWXYZ[`\]^_abcdefghijklmnopqrstuvwxyz{|}~edNTOPQRSUVWXYZCDEFGHIJ=>?@AB<89:;cba[`\]^_fgh !"#$%&'()*+,-./01234567KLM}zo|{mnxyqustvwik~jlpr   'Implements ETL operations over RTables.(c) Nikos Karagiannidis, 2018 BSD3 nkarag@gmail.com stable POSIX None&! DBFunctorETLmapping : it is the equivalent of a mapping in an ETL tool and consists of a series of ETLOperations that are applied, one-by-one, to some initial input RTable, but if binary ETLOperations are included in the ETLMapping, then there will be more than one input RTables that the ETLOperations of the ETLMapping will be applied to. When we apply (i.e., run) an ETLOperation of the ETLMapping we get a new RTable, which is then inputed to the next ETLOperation, until we finally run all ETLOperations. The purpose of the execution of an ETLMapping is to produce a single new RTable as the result of the execution of all the ETLOperations of the ETLMapping. In terms of database operations an ETLMapping is the equivalent of an CREATE AS SELECT (CTAS) operation in an RDBMS. This means that anything that can be done in the SELECT part (i.e., column projection, row filtering, grouping and join operations, etc.) 
in order to produce a new table, can be included in an ETLMapping.?An ETLMapping is executed with the etl (runETLmapping) operatorImplementation: An ETLMapping is implemented as a binary tree where the node represents the ETLOperation to be executed and the left branch is another ETLMapping, while the right branch is an RTable (that might be empty in the case of a Unary ETLOperation). Execution proceeds from bottom-left to top-right. This is similar in concept to a left-deep join tree. In a Left-Deep ETLOperation tree the "pipe" of ETLOperations comes from the left branches always. The leaf node is always an ETLMapping with an ETLMapEmpty in the left branch and an RTable in the right branch (the initial RTable inputed to the ETLMapping). In this way, the result of the execution of each ETLOperation (which is an RTable) is passed on to the next ETLOperation. Here is an example: { A Left-Deep ETLOperation Tree final RTable result / etlOp3 / etlOp2 rtab2 / A leaf-node --> etlOp1 emptyRTab / ETLMapEmpty rtab1 You see that always on the left branch we have an ETLMapping data type (i.e., a left-deep ETLOperation tree). So how do we implement the following case?  final RTable result / A leaf-node --> etlOp1 / rtab1 rtab2 bThe answer is that we "model" the left RTable (rtab1 in our example) as an ETLMapping of the form: S ETLMapLD { etlOp = ETLcOp{cmap = ColMapEmpty}, tabL = ETLMapEmpty, tabR = rtab1 } So we embed the rtab1 in a ETLMapping, which is a leaf (i.e., it has an empty prevMap), the rtab1 is in the right branch (tabR) and the ETLOperation is the EmptyColMapping, which returns its input RTable when executed. We can use function 2 for this job. 
So it becomes ] A leaf-node --> etlOp1 / rtabToETLMapping rtab1 rtab2 >In this manner, a leaf-node can also be implemented like this: Z final RTable result / etlOp3 / etlOp2 rtab2 / A leaf-node --> etlOp1 emptyRTab / rtabToETLMapping rtab1 emptyRTable  DBFunctor an empty node DBFunctora Left-Deep node DBFunctora Right-Deep node DBFunctora Balanced node  DBFunctor#the ETLOperation to be executed   DBFunctorWthe left-branch corresponding to the previous ETLOperation, which is input to this one.  DBFunctorthe right branch corresponds to another RTable (for binary ETL operations). If this is a Unary ETLOperation then this field must be an empty RTable.  DBFunctorthe left-branch corresponds to another RTable (for binary ETL operations). If this is a Unary ETLOperation then this field must be an empty RTable.   DBFunctorXthe right branch corresponding to the previous ETLOperation, which is input to this one. DBFunctorthe left-branch corresponding to the previous ETLOperation, which is input to this one. If this is a Unary ETLOperation then this field might be an empty ETLMapping.  DBFunctorthe right branch corresponding corresponding to the previous ETLOperation, which is input to this one. -- If this is a Unary ETLOperation then this field might be an empty ETLMapping. DBFunctor7An ETL operation applied to an RTable can be either an C (a relational agebra operation like join, filter etc.) defined in  RTable.Core module, or an  applied to an e DBFunctorA Column Transformation function data type. It is used in order to define an arbitrary column-level transformation (i.e., from a list of N input Column-Values we produce a list of M derived (output) Column-Values). A Column value is represented with the N. DBFunctorQThis is the basic data type to define the column-to-column mapping from a source e to a target e. Essentially, an 3 represents the column-level transformations of an d that will yield a target d. 
A mapping is simply a tuple of the form ( Source-Column(s), Target-Column(s), Transformation, RTuple-Filter), where we define the source columns over which a transformation (i.e. a function) will be applied in order to yield the target columns. Also, an RPredicate (i.e. a filter) might be applied on the source RTuple. Remember that an RTuple is essentially a mapping between a key (the Column Name) and a value (the RDataType value). So the various data constructors below simply describe the possible modifications of an RTuple originating from its own columns. So, we can have the following mapping types: a) single-source column to single-target column mapping (1 to 1): the source column will be removed or not based on the remove-source-column flag (duplicate column names are not allowed in an RTuple) b) multiple-source columns to single-target column mapping (N to 1): the N columns will be merged into the single target column based on the transformation. The N columns will be removed from the RTuple or not based on the remove-source-column flag (duplicate column names are not allowed in an RTuple) c) single-source column to multiple-target columns mapping (1 to M): the source column will be "expanded" to M target columns based on the transformation. The source column will be removed or not based on the remove-source-column flag (duplicate column names are not allowed in an RTuple) d) multiple-source columns to multiple-target columns mapping (N to M): the N source columns will be mapped to M target columns based on the transformation. The N columns will be removed from the RTuple or not based on the remove-source-column flag (duplicate column names are not allowed in an RTuple) Some examples of mappings are the following: ( Start_Date, No, StartDate, \t -> True) -- copy the source value to target and don't remove the source column, so the target RTuple will have both columns Start_Date and StartDate -- with exactly the same value) ([Amount, Discount], Yes, FinalAmount, (\[a, d] -> a * d) ) -- FinalAmount is a derived column based on a function applied to the two source columns. 
-- In the final RTuple we remove the two source columns. An  can be applied with the * (runColMapping) operator DBFunctor>single-source column to single-target column mapping (1 to 1). DBFunctor@multiple-source columns to single-target column mapping (N to 1) DBFunctor@single-source column to multiple-target columns mapping (1 to N) DBFunctorsmultiple-source column to multiple target columns mapping (N to M) ) DBFunctorConstructs an RColMapping. This is the suggested method for creating a column mapping and not by calling the data constructors directly.* DBFunctor&runCM operator executes an RColMapping If a target-column has the same name with a source-column and a DontRemoveSrc (i.e., removeSrcCol == No) has been specified, then the (target-column, target-value) key-value pair, overwrites the corresponding (source-column, source-value) key-value pair DBFunctorcApply an RColMapping to a source RTable and produce a new RTable. If a target-column has the same name with a source-column and a DontRemoveSrc (i.e., removeSrcCol == No) has been specified, then the (target-column, target-value) key-value pair, overwrites the corresponding (source-column, source-value) key-value pair. If a filter is embedded in the , then the returned e will include only the d$s that satisfy the filter predicate.+ DBFunctorexecutes a Unary ETL Operation DBFunctorexecutes an ETL Operation, DBFunctorexecutes a Binary ETL Operation DBFunctorexecutes an ETL Operation- DBFunctor<Creates a left-deep leaf ETL Mapping, of the following form: { A Left-Deep ETLOperation Tree final RTable result / etlOp3 / etlOp2 rtab2 / A leaf-node --> etlOp1 emptyRTab / ETLMapEmpty rtab1 . 
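The left-deep tree described above can be sketched with a toy model. This is only an illustration under stated assumptions: `Row`, `Table`, `Op`, `Mapping`, `leaf`, `connect` and `run` are hypothetical stand-ins invented for this sketch, not the DBFunctor types or functions, and a table is simplified to a list of maps.

```haskell
import qualified Data.Map.Strict as M

-- Hypothetical stand-ins for a tuple and a table (not the library's types).
type Row   = M.Map String Int
type Table = [Row]

-- A toy ETL operation: either unary or binary.
data Op
  = Unary  (Table -> Table)
  | Binary (Table -> Table -> Table)

-- A left-deep tree: the previous mapping on the left, a table on the right.
data Mapping
  = MapEmpty
  | MapLD Op Mapping Table

-- Leaf node: empty left branch, the initial table in the right branch.
leaf :: Op -> Table -> Mapping
leaf op t = MapLD op MapEmpty t

-- Connect a new node. The right-branch table comes first so that the
-- previous mapping is the last argument and partial application composes.
connect :: Op -> Table -> Mapping -> Mapping
connect op t prev = MapLD op prev t

-- Execute bottom-left to top-right: each node consumes the result of its
-- left branch; a unary leaf reads the table stored in its right branch.
run :: Mapping -> Table
run MapEmpty = []
run (MapLD op prev t) =
  case (op, prev) of
    (Unary f,  MapEmpty) -> f t
    (Unary f,  _)        -> f (run prev)
    (Binary g, _)        -> g (run prev) t

rtab1, rtab2 :: Table
rtab1 = [M.fromList [("id", 1)], M.fromList [("id", 2)], M.fromList [("id", 3)]]
rtab2 = [M.fromList [("id", 10)]]

-- Three chained operations: filter, then limit, then append a second table.
demo :: Mapping
demo =
  connect (Binary (++)) rtab2
    . connect (Unary (take 2)) []
    $ leaf (Unary (filter keep)) rtab1
  where
    keep r = M.lookup "id" r /= Just 1
```

Evaluating `run demo` in GHCi walks the tree from the leaf upwards, so the filter runs first and the append runs last.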
DBFunctor Creates a binary-operation leaf node of the form: A leaf-node --> etlOp1 / rtabToETLMapping rtab1 rtab2 / DBFunctor Connects an ETL Mapping to a left-deep ETL Mapping tree of the form: A Left-Deep ETLOperation Tree final RTable result / etlOp3 / etlOp2 rtab2 / A leaf-node --> etlOp1 emptyRTab / ETLMapEmpty rtab1 Example: -- connect a Unary ETL mapping (etlOp2) etlOp2 / etlOp1 emptyRTab => connectETLMapLD etlOp2 emptyRTable prevMap -- connect a Binary ETL Mapping (etlOp3) etlOp3 / etlOp2 rtab2 => connectETLMapLD etlOp3 rtab2 prevMap Note that the right branch (RTable) appears first in the list of input arguments of this function and the left branch (ETLMapping) appears second. This may look strange, and one might think that it is a mistake (i.e., the left branch should appear first and the right branch second), since we read from left to right. However, this was a deliberate choice: the left branch (which is the connection point with the previous ETLMapping) is kept as the last argument, so that we can partially apply the other arguments and get a new function whose only input parameter is the previous mapping. This is very helpful in function composition. 0 DBFunctor This operator executes an  DBFunctor Executes an 1 DBFunctor This operator executes an  and returns the  Writer Monad that embeds, apart from the resulting RTable, also the number of RTuples returned 2 DBFunctor Model an RTable as an ETLMapping which, when executed, will return the input RTable ) DBFunctor List of source column names DBFunctor List of target column names DBFunctor Column Transformation function DBFunctor Remove source column option DBFunctor Filtering predicate DBFunctor Output Column Mapping DBFunctor input RTable DBFunctor output RTable DBFunctor input RTable1 DBFunctor input RTable2 DBFunctor output RTable - DBFunctor ETL operation of this ETL mapping DBFunctor input RTable DBFunctor output ETLMapping . 
DBFunctor!ETL operation of this ETL mapping DBFunctor input RTable1 DBFunctor input RTable2 DBFunctoroutput ETLMapping/ DBFunctor!ETL operation of this ETL Mapping DBFunctor[Right RTable (right branch) (if this is a Unary ETL mapping this should be an emptyRTable)  DBFunctor*Previous ETL mapping (left branch)  DBFunctor8New ETL Mapping, which has added at the end the new node DBFunctorinput ETLMapping DBFunctor output RTable empty ETL mapping1 DBFunctorinput ETLMapping DBFunctoroutput RTabResult/      !"#$%&'()*+,-./012/ !"#$%)     &'(*+,012-./<A simple Embedded DSL for ETL/ELT data processing in Haskell(c) Nikos Karagiannidis, 2018 BSD3 nkarag@gmail.com stable POSIX None 9 DBFunctorInternal type: We use this data type in order to identify unary vs binary operations and if the table is coming from the left or right branch6 DBFunctor?A grouping predicate clause. It defines an arbitrary function (RGroupPRedicate), which drives when two d"s should belong in the same group.8 DBFunctorsDefines the name of the column that will hold the aggregate operation result. It resembles the "AS" clause in SQL.: DBFunctor6Julius Clause to provide a custom aggregation function< DBFunctor3These are the available aggregate operation clauses> DBFunctorCount aggregation (no distinct) | CountDist ColumnName AsColumn -- ^ Count distinct aggregation. Returns the distinct number of values for this column. | CountStar AsColumn -- ^ Returns the number of d s in the e (i.e., count(*) in SQL) A DBFunctorAverage aggregationB DBFunctorA custom aggregate operationC DBFunctorAn Aggregate Operation ClauseE DBFunctor+Join Predicate Clause. It defines when two ds should be paired.G DBFunctorWThis clause is used for expressions where we do not allow the use of the Previous valueI DBFunctorA Table Expression defines the e< on which the current ETL Operation will be applied. If the K! 
constructor is used, then this eP is the result of the previous ETL Operations in the current Julius Expression (y)L DBFunctor3Resembles the "FROM" clause in SQL. It defines the e2 on which the Relational Operation will be appliedN DBFunctor9It is used to define an arbitrary binary operation on an eP DBFunctor8It is used to define an arbitrary unary operation on an eR DBFunctorThe Relational Operation (R') is a Julius clause that represents a Relational Algebra Operation. S DBFunctordS filtering clause (selection operation), based on an arbitrary predicate function (7)T DBFunctorColumn projection clauseU DBFunctorAggregate Operation clauseV DBFunctorDGroup By clause, based on an arbitrary Grouping predicate function ()W DBFunctorYInner Join clause, based on an arbitrary join predicate function - not just equi-join - ()X DBFunctorXLeft Join clause, based on an arbitrary join predicate function - not just equi-join - ()Y DBFunctorYRight Join clause, based on an arbitrary join predicate function - not just equi-join - ()Z DBFunctor^Full Outer Join clause, based on an arbitrary join predicate function - not just equi-join - ()[ DBFunctorIntersection clause\ DBFunctor Union clause] DBFunctor'Minus clause (set Difference operation)_ DBFunctor/This is a generic unary operation on a RTable (;). It is used to define an arbitrary unary operation on an e` DBFunctor0This is a generic binary operation on a RTable (<). It is used to define an arbitrary binary operation on an ea DBFunctorOrder By clause.b DBFunctor#A Relational Operation Expression (bP) is a sequence of one or more Relational Algebra Operations applied on a input e6. It is a sub-expression within a Julius Expression (y) and we use it whenever we want to apply relational algebra operations on an RTable (which might be the result of previous operations in a Julius Expression). A Julius Expression (y&) can contain an arbitrary number of b's. 
The relational operation connector d" is left associative because in a b@ operations are evaluated from left to right (or top to bottom).e DBFunctorAn d predicate clause.g DBFunctor^Indicator of whether the source column(s) in a Column Mapping will be removed or not (used in pC) If a target-column has the same name with a source-column and a i has been specified, then the (target-column, target-value) key-value pair, overwrites the corresponding (source-column, source-value) key-value pair.j DBFunctor Defines the e/ that the current operation will be applied to.l DBFunctorKDefines the column transformation function of a Column Mapping Expression (p ), the input e: that this transformation will take place, an indicator (gB) of whether the Source Columns will be removed or not in the new eL that will be created after the Column Mapping is executed and finally, an d filter predicate (e) that defines the subset of dLs that this Column Mapping will be applied to. If it must be applied to all d!s, then for the last parameter (e%), we can just provide the following 7: FilterBy (\_ -> True) n DBFunctor;Defines the Target Columns of a Column Mapping Expression (p*) and the column transformation function ().p DBFunctorA Column Mapping () is the main ETL/ELT construct for defining a column-level transformation. Essentially with a Column Mapping we can create one or more new (derived) column(s) (Target Columns2), based on an arbitrary transformation function (6) with input parameters any of the existing columns (Source Columns ). So a p is either empty, or it defines the source columns, the target columns and the transformation from source to target. 
Notes: * If a target-column has the same name with a source-column and a i, or a h has been specified, then the (target-column, target-value) key-value pair, overwrites the corresponding (source-column, source-value) key-value pair * The returned e will include only the d5s that satisfy the filter predicate specified in the f clause.s DBFunctor4A named intermediate result in a Julius expression (y), which we can access via the } function.u DBFunctorTThe name of an intermediate result, used as a key for accessing this result via the } function.v DBFunctorCAn ETL Operation Expression is either a Column Mapping Expression (p)), or a Relational Operation Expression (b)y DBFunctorAn ETL Mapping Expression is a "Julius Expression"j. It is a sequence of individual ETL Operation Expressions. Each such ETL Operation "acts" on some input e) and produces a new "transformed" output e. The ETL Mapping connector { (as well as the |- connector) is left associative because in a yc operations are evaluated from left to right (or top to bottom) A Named ETL Operation Expression (s) is just an ETL Operation with a name, so as to be able to reference this specific step in the chain of ETL Operations. It is actually a named intermediate resultH, which we can reference and use in other parts of our Julius expression} DBFunctor/Returns a prefix of an ETLMappingExpr that matches a named intermediate result. For example, below we show a Julius expression where we define an intermediate named result called "myResult". This result, is used at a later stage in this Julius expression, with the use of the function takeNamedResult. g etlXpression = EtlMapStart :-> (EtlC $ ...) :=> NamedResult "myResult" (EtlR $ ...) :-> (EtlR $ ... ) :-> (EtlR $ ROpStart :. (Minus (TabL $ juliusToRTable $ takeNamedResult "myResult" etlXpression -- THIS IS THE POINT WHERE WE USE THE NAMED RESULT! 
) (Previous)) ) In the above Julius expression (etlXpresion) the "myResult" named result equals to the prefix of the etlXpresion, up to the operation (included) with the named result "myResult".  takeNamedResult "myResult" etlXpression == EtlMapStart :-> (EtlC $ ...) :=> NamedResult "myResult" (EtlR $ ...) Note that the julius expression is scanned from right to left and thus it will return the longest prefix expression that matches the input name~ DBFunctor9Evaluates (parses) the Julius exrpession and produces an . The ^ is an internal representation of the Julius expression and one needs to combine it with the 0> function, in order to evaluate the Julius expression into an e.. This can be achieved directly with function  DBFunctorjPure code to evaluate the "ETL-logic" of a Julius expression and generate the corresponding target RTable./The evaluation of a Julius expression (i.e., a y?) to an RTable is strict. It evaluates fully to Normal Form (NF) as opposed to a lazy evaluation (i.e., only during IO), or evaluation to a WHNF. This is for efficiency reasons (e.g., avoid space leaks and excessive memory usage). It also has the impact that exceptions will be thrown at the same line of code that 2 is called. Thus one should wrap this call with a  handler, or use , or 2, if one wants to handle the exception gracefully.Example: do catch (printRTable $ juliusToRTable $ <a Julius expression> ) (\e -> putStrLn $ "There was an error in the Julius evaluation: " ++ (show (e::SomeException)) )  Or, similarly Bdo p <- (eitherPrintRTable printRTable $ juliusToRTable $ <a Julius expression> ) :: IO (Either SomeException ()) case p of Left exc -> putStrLn $ "There was an error in the Julius evaluation: " ++ (show exc) Right _ -> return ()  DBFunctorEvaluate a Julius expression within the IO Monad. I.e., Effectful code to evaluate the "ETL-logic" of a Julius expression and generate the corresponding target RTable./The evaluation of a Julius expression (i.e., a y?) 
to an RTable is strict. It evaluates fully to Normal Form (NF) as opposed to a lazy evaluation (i.e., only during IO), or evaluation to a WHNF. This is for efficiency reasons (e.g., avoid space leaks and excessive memory usage). It also means that exceptions will be thrown at the same line of code where this function is called. Thus one should wrap this call with a catch handler, or use eitherRunJulius, if one wants to handle the exception gracefully. Example: do result <- catch (runJulius $ <a Julius expression>) (\e -> do putStrLn $ "there was an error in Julius evaluation: " ++ (show (e::SomeException)) return emptyRTable ) DBFunctor Evaluate a Julius expression and return the corresponding target RTable or an exception. One can define custom exceptions to be thrown within a Julius expression. This function will catch any exceptions that are instances of the Exception type class. The evaluation of a Julius expression (i.e., a y) to an RTable is strict. It evaluates fully to Normal Form (NF) as opposed to a lazy evaluation (i.e., only during IO), or evaluation to a WHNF. This is for efficiency reasons (e.g., avoid space leaks and excessive memory usage). Example: do res <- (eitherRunJulius $ <a Julius expression>) :: IO (Either SomeException RTable) resultRTab <- case res of Right t -> return t Left exc -> do putStrLn $ "there was an error in Julius evaluation: " ++ (show exc) return emptyRTable DBFunctor Receives an input Julius expression, evaluates it to an ETL Mapping (+) and executes it, in order to return an  containing an RTable storing the result of the ETL Mapping, as well as the number of RTuples returned DBFunctor Evaluate a Julius expression within the IO Monad and return an . DBFunctor Evaluate a Julius expression within the IO Monad and return either an 9, or an exception, in case of an error during evaluation. DBFunctor Generic ETL execution function. It receives a list of input (aka "source") RTables and an ETL function that produces a list of output (aka "target") RTables. 
The ETL function should embed all the "transformation-logic" from the source es to the target es. DBFunctorFGeneric ETL execution function that returns either the target list of eus, or an exception in case of a problem during the ETL code execution. It receives a list of input (aka "source") eEs and an ETL function that produces a list of output (aka "target") eQs. The ETL function should embed all the "transformation-logic" from the source es to the target es. DBFunctorAEvaluates (parses) a Relational Operation Expression of the form   ROp :. ROp :. ... :. ROpStart 9and produces the corresponding ROperation data type. This will be an RCombinedOp relational operation that will be the composition of all relational operators in the ROpExpr. In the returned result the TabExpr corresponding to the left and right RTable inputs to the ROperation respectively, are also returned. DBFunctorQturns the list of agg operation expressions to a list of RAggOperation data type DBFunctor Returns an - that adds a surrogate key (SK) column to an e and fills each row with a SK value. This function is only exposed for backward compatibility reasons. The recommended function to use instead is ?, which can be embedded directly into a Julius expression as a . DBFunctor Returns an  (e -> e.) that adds a surrogate key (SK) column to an et and fills each row with a SK value. It primarily is intended to be used within a Julius expression. For example: 9 GenUnaryOp (On Tab rtab1) $ ByUnaryOp (addSurrogateKeyJ TxSK 0)  DBFunctor Returns an  that Appends an e to a target eq This function is only exposed for backward compatibility reasons. The recommended function to use instead is ?, which can be embedded directly into a Julius expression as a . DBFunctor Returns a  (e -> e -> e) that Appends an e to a target eO. It is primarily intended to be used within a Julius expression. 
For example: D GenBinaryOp (TabL rtab1) (Tab $ rtab2) $ ByBinaryOp appendRTableJ } DBFunctor#the name of the intermediate result DBFunctorinput ETLMapping Expression DBFunctoroutput ETLMapping Expression DBFunctorThe name of the surrogate key column -> Integer -- ^ The initial value of the Surrogate Key will be the value of this parameter  DBFunctorNThe initial value of the Surrogate Key will be the value of this parameter  DBFunctorLOutput ETL operation which encapsulates the add surrogate key column mapping DBFunctorThe name of the surrogate key column -> Integer -- ^ The initial value of the Surrogate Key will be the value of this parameter  DBFunctorNThe initial value of the Surrogate Key will be the value of this parameter  DBFunctor Input RTable DBFunctor Output RTable DBFunctorOutput ETL Operation DBFunctor Target RTable DBFunctor Input RTable DBFunctorOutput RTable 4  !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNTOPQRSUVWXYZ[`\]^_abcdefghijklmnopqrstuvwxyz{|}~6789:;<=@?>ABCDEFGHIJKLMNOPQRW\TSUVXYZ[]^_`abdcefghijklmnopqrstuvwxy{|z}~Uy{|zvwxstupqrnolmjkIJKghiefbdcRW\TSUVXYZ[]^_`aLMCD<=@?>AB89:;67GHEFPQNO~}d6{5|5 Implements e3 over CSV (TSV, or any other delimiter) files logic(c) Nikos Karagiannidis, 2018 BSD3 nkarag@gmail.com stable POSIX None"#;= DBFunctor/This exception signifies an error in parsing a   to an N value DBFunctor<Exception to signify an error in decoding a CSV file into a  data type DBFunctorQOptions for a CSV file (e.g., delimiter specification, header specification etc.) DBFunctorYes or No sum type DBFunctor=Definition of a CSV Record column. (type Field = ByteString) DBFunctorhDefinition of a CSV Row. Essentially a Row is just a Vector of ByteString (type Record = Vector Field) DBFunctorIDefinition of a CSV file. 
Treating CSV data as opaque byte strings (see Csv type in Cassava library - Data.Csv: type Csv = Vector Record) DBFunctor reads a CSV file and returns a lazy bytestring DBFunctor reads a CSV file and returns a CSV data type (Treating CSV data as opaque byte strings) DBFunctor reads a CSV file based on input options (delimiter and header option) and returns a CSV data type (Treating CSV data as opaque byte strings) DBFunctor write a CSV (bytestring) to a newly created csv file DBFunctor write a CSV to a newly created csv file DBFunctor print input CSV on screen DBFunctor print input CSV on screen DBFunctor copy input csv file to specified output csv file DBFunctor csvToRTable: Creates an RTable from a CSV and a set of RTable Metadata. The RTable metadata essentially defines the data type of each column so as to call the appropriate data constructor of RDataType and turn the ByteString values of the CSV into RDataType values of the RTable. We assume that the order of the columns in the CSV is identical to the order of the columns in the RTable metadata DBFunctor rtableToCSV : Returns a CSV from an RTable. The first line of the CSV will be the header line, taken from the RTable metadata. Note that the driver for creating the output CSV file is the input RTableMData describing the columns and RDataTypes of each RTuple. This means that if the RTableMData includes a subset of the actual columns of the input RTable, then no error will occur and the output CSV will include only this subset. By the same token, if in the RTableMData there is a column name that is not present in the input RTable, then an error will occur. DBFunctor csv2rtable : turns an input CSV into an RTable. The input CSV will be a ByteString. We assume that the first line is the CSV header, including the Column Names. The RTable that will be created will have as column names the headers appearing in the first line of the CSV. 
Internally we use CV.decodeByName to achieve this decoding where: decodeByName :: FromNamedRecord a => ByteString -> Either String (Header, Vector a) Essentially, decodeByName will return a Vector of RTuples. In order to be able to decode a CSV bytestring into an RTuple, we need to make RTuple an instance of the FromNamedRecord typeclass and implement the parseNamedRecord function. But this is not necessary, since there is already an instance for CV.FromNamedRecord (HM.HashMap a b), which is the same, since an RTuple is a HashMap. Also we need to make RDataType an instance of FromField ((CV.FromField RDataType)) by implementing parseField where: parseField :: Field -> Parser a type Field = ByteString See RTable module for these instances DBFunctor rtable2csv: encode an RTable into a CSV bytestring. The first line of the CSV will be the header, which comprises the column names. Internally we use CV.encodeByName to achieve this encoding where: encodeByName :: ToNamedRecord a => Header -> [a] -> ByteString Efficiently serialize CSV records as a lazy ByteString. The header is written before any records and dictates the field order. type Header = Vector Name type Name = ByteString In order to encode an input RTable into a CSV bytestring we need to make RTuple an instance of the ToNamedRecord typeclass and implement the toNamedRecord function. Where: toNamedRecord :: a -> NamedRecord type NamedRecord = HashMap ByteString ByteString namedRecord :: [(ByteString, ByteString)] -> NamedRecord Construct a named record from a list of name-value ByteString pairs. 
Use .= to construct such a pair from a name and a value.(.=) :: ToField a => ByteString -> a -> (ByteString, ByteString) @ In our case, we dont need to do this because an RTuple is just a synonym for HM.HashMap ColumnName RDataType and the data type HashMap a b is already an instance of ToNamedRecord.Also we need to make RDataType an instance of ToField ((CV.ToField RDataType)) by implementing toField, so as to be able to convert an RDataType into a ByteString where: L toField :: a -> Field type Field = ByteString  See e module for these instance DBFunctor creates a    (as defined in Data.Csv ) from an e= type Header = Vector Name type Name = ByteString DBFunctorO(1) First row DBFunctorKO(1) Yield all but the first row without copying. The CSV may not be empty. DBFunctor6selectNrows: Returns the first N rows from a CSV file DBFunctorkColumn projection on an input CSV file where desired columns are defined by position (index) in the CSV. DBFunctorIn order to encode an input RTable into a CSV bytestring we need to make Rtuple an instance of the ToNamedRecord typeclass and implement the toNamedRecord function. Where:  toNamedRecord :: a -> NamedRecord type NamedRecord = HashMap ByteString ByteString namedRecord :: [(ByteString, ByteString)] -> NamedRecord Construct a named record from a list of name-value ByteString pairs. Use .= to construct such a pair from a name and a value. (.=) :: ToField a => ByteString -> a -> (ByteString, ByteString) In our case, we dont need to do this because an RTuple is just a synonym for HM.HashMap ColumnName RDataType and the data type HashMap a b is already an instance of ToNamedRecord.Also we need to make RDataType an instance of ToField ((CV.ToField RDataType)) by implementing toField, so as to be able to convert an RDataType into a ByteString where: H toField :: a -> Field type Field = ByteString  DBFunctorENecessary instance in order to convert a CSV file column value to an N value. 
DBFunctor CSV data are "Tabular" data and thus implement the RTabular interface
DBFunctor the CSV file
DBFunctor the output CSV
DBFunctor the CSV file
DBFunctor the output CSV type
DBFunctor the CSV file
DBFunctor the output CSV type
DBFunctor the csv file to be created
DBFunctor input CSV
DBFunctor the csv file to be created
DBFunctor input  
DBFunctor input CSV to be printed on screen
DBFunctor input  to be printed on screen
DBFunctor input csv file
DBFunctor output csv file
DBFunctor input RTable metadata describing the RTable
DBFunctor input RTable
DBFunctor output CSV
DBFunctor input CSV (we assume that this CSV has a header in the 1st line)
DBFunctor output RTable
DBFunctor input RTable
DBFunctor Output ByteString
DBFunctor Number of rows to select
DBFunctor Input csv
DBFunctor Output csv
DBFunctor input list of column indexes
DBFunctor input csv
DBFunctor output CSV
(DBFunctor-0.1.0.0-4dDhI1jCY88DHDs5IjLSX3 RTable.CoreEtl.Internal.Core Etl.JuliusRTable.Data.CSVPaths_DBFunctor Data.HashMapStrictDataListData.CsvHeaderbase Control.Monad<=<GHC.Base.EmptyInputStringsInToRTimestampUnsupportedTimeStampFormatColumnDoesNotExistFormatSpecifier DefaultFormatFormat ColFormatMap RTupleFormat colSelectList colFormatMap RTabResult RTuplesRet AggFunction RAggOperation sourceCol targetColaggFuncRGroupPredicateRJoinPredicateBinaryRTableOperationUnaryRTableOperation OrderingSpecAscDesc ROperationROperationEmptyRUnionRInterRDiffRPrjRFilterRInJoin RLeftJoin RRightJoin RAggregateRGroupBy RCombinedOpRBinOpROrderBy colPrjListfpredjpredaggListgpred colGrByListrcombOprbinOp colOrdList RPredicate ColumnInfonamedtype RTupleMData RTableMDatartname rtuplemdata pkColumns uniqueKeys RTimestamp RTimestampValyearmonthdayhours24minutesseconds IgnoreDefaultIgnore NotIgnore 
RDataTypeRIntRTextRDateRTimeRDoubleNullrintrtextrdatedtformatrtimerdouble ColumnDTypeIntegerVarcharDate TimestampDouble RTableName ColumnNameNameRTupleRTableRTabulartoRTable fromRTable rtableToListrtableFromList rtupleToListrtupleFromListisNull isNotNull stdDateFormatgetColumnNamesfromRTabheadRTupgetColumnNamesfromRTuple rtupLookuprtupLookupDefaultgetRTupColValuenvl nvlColValuedecodeColValue nvlRTuple nvlRTable decodeRTable upsertRTuple stripRText rdtappendremoveCharAroundRText toRTimestampinstr instrText instrRTextcreateRTimestamptoTextfromTextisTextstdTimestampFormatrTimestampToRTextcreateRTableMDatatoListColumnNametoListColumnInfotoListRDataTypelistOfColInfoRDataTypecreateRDataTyperaggGenericAggraggSum raggCountraggAvgraggMaxraggMinropUrunUnaryROperationropUresrunUnaryROperationResropBrunBinaryROperationropBresrunBinaryROperationRes isRTabEmpty isRTupEmpty emptyRTable emptyRTuplecreateSingletonRTable createRtuplecreateNullRTuple isNullRTuple rtabFoldr'rdatatypeFoldr' rtabFoldl'rdatatypeFoldl'rtabMap rtuplesRet getRTuplesRet rtabResult runRTabResultexecRTabResult removeColumn addColumnf runRfilterp runProjectionrunProjectionMissedHitslimitiJ runInnerJoinO joinRTupleslJ runLeftJoinrJ runRightJoinfoJrunFullOuterJoinurunUnioni runIntersectdrunDiffrAggrunAggregationrO runOrderByrGgroupNoAggList concatRTab groupNoAgg runGroupByrCombrunCombinedROpinsertAppendRTabinsertPrependRTab updateRTabgenDefaultColFormatMapgenColFormatMapgenRTupleFormatgenRTupleFormatDefaulteitherPrintfRTable printfRTableeitherPrintRTable printRTable$fOrdRTimestamp$fEqRTimestamp$fNFDataRTimestamp$fFractionalRDataType$fNumRDataType$fOrdRDataType 
$fEqRDataType$fNFDataRDataType$fExceptionColumnDoesNotExist%$fExceptionUnsupportedTimeStampFormat*$fExceptionEmptyInputStringsInToRTimestamp$fShowColumnDType$fEqColumnDType$fEqIgnoreDefault$fShowIgnoreDefault$fShowRTimestamp$fReadRTimestamp$fGenericRTimestamp$fShowRDataType$fReadRDataType$fGenericRDataType$fShowColumnInfo$fEqColumnInfo$fShowRTableMData$fEqRTableMData$fShowOrderingSpec$fEqOrderingSpec$fEqFormatSpecifier$fShowFormatSpecifier$fEqRTupleFormat$fShowRTupleFormat$fEqColumnDoesNotExist$fShowColumnDoesNotExist$fEqUnsupportedTimeStampFormat $fShowUnsupportedTimeStampFormat#$fEqEmptyInputStringsInToRTimestamp%$fShowEmptyInputStringsInToRTimestamp ETLMapping ETLMapEmptyETLMapLDETLMapRD ETLMapBaletlOptabLtabRtabLrdtabRrdtabLbaltabRbal ETLOperationETLrOpETLcOpropcmapColXForm RColMapping ColMapEmptyRMap1x1RMapNx1RMap1xNRMapNxMsrcCol removeSrcColtrgCol transform1x1srcRTupleFilter srcColGrp transformNx1 trgColGrp transform1xN transformNxMYesNoYesNocreateColMappingrunCMetlOpUetlOpBcreateLeafETLMapLDcreateLeafBinETLMapLDconnectETLMapLDetletlResrtabToETLMapping$fEqETLMapping $fEqYesNo $fShowYesNo GroupOnPredGroupOnAsColumnAsAggByAggOpSumCountMinMaxAvgGenAgg AggregateAggOn TabExprJoinJoinOn TabLiteralTabLTabExprTabPrevious FromRTableFromByGenBinaryOperation ByBinaryOpByGenUnaryOperation ByUnaryOp RelationalOpFilterSelectAggGroupByJoinLJoinRJoinFOJoin IntersectUnionMinusMinusP GenUnaryOp GenBinaryOpOrderByROpExprROpStart:.ByPredFilterBy RemoveSrcCol RemoveSrc DontRemoveSrcOnRTableOn ByFunctionByToColumnTargetColMappingExprSourceColMappingEmptyNamedMap NamedResultNamedResultName ETLOpExprEtlCEtlRETLMappingExpr EtlMapStart:->:=>takeNamedResult evalJuliusjuliusToRTable runJuliuseitherRunJuliusjuliusToResultrunJuliusToResulteitherRunJuliusToResultrunETL eitherRunETLaddSurrogateKeyaddSurrogateKeyJ appendRTable appendRTableJCSVColumnToRDataTypeErrorCsvFileDecodingError CSVOptions delimiter hasHeaderColumnRowCSV readCSVFilereadCSVreadCSVwithOptions writeCSVFilewriteCSV 
printCSVFileprintCSVcopyCSVcsvHeaderFromRtableheadCSVtailCSV selectNrowsprojectByIndex$fToFieldRDataType$fFromFieldRDataType$fRTabularVector$fExceptionCsvFileDecodingError$$fExceptionCSVColumnToRDataTypeError$fEqCsvFileDecodingError$fShowCsvFileDecodingError$fEqCSVColumnToRDataTypeError$fShowCSVColumnToRDataTypeErrorversion getBinDir getLibDir getDynLibDir getDataDir getLibexecDir getSysconfDirgetDataFileNameString Text.Printfprintfghc-prim GHC.TypesTrue3unordered-containers-0.2.9.0-HQtYJEH7265DslRAJ09vVDData.HashMap.BaseHashMap ColumnOrderNothing text-1.2.3.0Data.Text.InternalTextcreateRTupleMdatasumFold countFoldmaxFoldminFold runInnerJoincreateOrderingSpec Data.EitherEithergetMaxLengthPerColumnFmtgetMaxLengthPerColumnaddSpace addCharacterprintContLineFmt printContLineprintRTableHeaderFmtprintRTableHeaderprintRTupleFmt printRTuplerdataTypeToStringFmtrdataTypeToString runColMappingrunUnaryETLOperationrunBinaryETLOperation runETLmappingTabExprEnhancedGHC.IOcatch GHC.Exception Exception evalROpExpraggOpExprToAggOp csvToRTable rtableToCSV csv2rtable rtable2csv