- type Row = [Field]
- type Field = ByteString
- type MapRow = Map ByteString ByteString
- class CSVeable r where
- rowToStr :: CSVSettings -> r -> ByteString
- fileHeaders :: [r] -> Maybe Row
- iterCSV :: CSVSettings -> CSVAction r a -> a -> Iteratee ByteString IO a
- fileSink :: CSVSettings -> FilePath -> (Maybe Handle, Int) -> ParsedRow r -> Iteratee ByteString IO (Maybe Handle, Int)
- mapCSVFiles :: [FilePath] -> CSVSettings -> (r -> [r]) -> FilePath -> IO (Either SomeException Int)
- data CSVeable r => ParsedRow r
- data CSVSettings = CSVS {
- csvSep :: !Char
- csvQuoteChar :: !(Maybe Char)
- csvOutputQuoteChar :: !(Maybe Char)
- csvOutputColSep :: !Char
- defCSVSettings :: CSVSettings
- readCSVFile :: CSVeable r => CSVSettings -> FilePath -> IO (Either SomeException [r])
- writeCSVFile :: CSVeable r => CSVSettings -> FilePath -> [r] -> IO Int
- appendCSVFile :: CSVeable r => CSVSettings -> FilePath -> [r] -> IO Int
- foldCSVFile :: CSVeable r => FilePath -> CSVSettings -> CSVAction r a -> a -> IO (Either SomeException a)
- type CSVAction r a = a -> ParsedRow r -> Iteratee ByteString IO a
- funToIter :: CSVeable r => (a -> ParsedRow r -> a) -> CSVAction r a
- funToIterIO :: CSVeable r => (a -> ParsedRow r -> IO a) -> CSVAction r a
- mapCSVFile :: CSVeable r => FilePath -> CSVSettings -> (r -> [r]) -> FilePath -> IO (Either SomeException Int)
- mapAccumCSVFile :: CSVeable r => FilePath -> CSVSettings -> (acc -> r -> (acc, [r])) -> acc -> FilePath -> IO (Either SomeException acc)
- mapIntoHandle :: CSVeable r => CSVSettings -> Bool -> Handle -> (r -> IO [r]) -> Iteratee ByteString IO Int
- collectRows :: CSVeable r => CSVAction r [r]
- outputRowIter :: CSVeable r => CSVSettings -> Handle -> r -> Iteratee ByteString IO ()
- outputRowsIter :: CSVeable r => CSVSettings -> Handle -> [r] -> Iteratee ByteString IO ()
- outputRow :: CSVeable r => CSVSettings -> Handle -> r -> IO ()
- outputRows :: CSVeable r => CSVSettings -> Handle -> [r] -> IO ()
- outputColumns :: CSVSettings -> Handle -> [ByteString] -> MapRow -> IO ()
- writeHeaders :: CSVeable r => CSVSettings -> Handle -> [r] -> IO Bool
CSV Data types
type Field = ByteStringSource
type MapRow = Map ByteString ByteStringSource
rowToStr :: CSVSettings -> r -> ByteStringSource
Convert a CSV row into strict ByteString equivalent.
fileHeaders :: [r] -> Maybe RowSource
Possibly return headers for a list of rows.
iterCSV :: CSVSettings -> CSVAction r a -> a -> Iteratee ByteString IO aSource
The raw iteratee to process any Enumerator stream
fileSink :: CSVSettings -> FilePath -> (Maybe Handle, Int) -> ParsedRow r -> Iteratee ByteString IO (Maybe Handle, Int)Source
Iteratee to push rows into a given file
:: [FilePath] | Input files |
-> CSVSettings | CSV Settings |
-> (r -> [r]) | A function to map a row onto rows |
-> FilePath | Output file |
-> IO (Either SomeException Int) | Number of rows processed |
Like mapCSVFile
but operates on multiple files pouring results into
a single file.
data CSVeable r => ParsedRow r Source
A datatype that incorporates the signaling of parsing status to the user-developed iteratee.
We need this because some iteratees do interleaved IO (such as outputting to a file via a handle inside the accumulator) and some final actions may need to be taken upon encountering EOF (such as closing the interleaved handle).
Use this datatype when developing iteratees for use with fold* family of functions (Row enumarators).
CSV Setttings
data CSVSettings Source
Settings for a CSV file. This library is intended to be flexible and offer a way to process the majority of text data files out there.
CSVS | |
|
defCSVSettings :: CSVSettingsSource
Default settings for a CSV file.
csvSep = ',' csvQuoteChar = Just '"' csvOutputQuoteChar = Just '"' csvOutputColSep = ','
Reading / Writing CSV Files
These are some simple file-related operations for basic use cases.
:: CSVeable r | |
=> CSVSettings | CSV settings |
-> FilePath | FilePath |
-> IO (Either SomeException [r]) | Collected data |
:: CSVeable r | |
=> CSVSettings | CSV settings |
-> FilePath | Target file path |
-> [r] | Data to be output |
-> IO Int | Number of rows written |
:: CSVeable r | |
=> CSVSettings | CSV settings |
-> FilePath | Target file path |
-> [r] | Data to be output |
-> IO Int | Number of rows written |
Generic Folds Over CSV Files
These operations enable you to do whatever you want with CSV files; including interleaved IO, etc.
:: CSVeable r | |
=> FilePath | File to open as a CSV file |
-> CSVSettings | CSV settings to use on the input file |
-> CSVAction r a | Fold action |
-> a | Initial accumulator |
-> IO (Either SomeException a) | Error or the resulting accumulator |
Open & fold over the CSV file.
Processing starts on row 2 for MapRow instance to use first row as column headers.
type CSVAction r a = a -> ParsedRow r -> Iteratee ByteString IO aSource
An iteratee that processes each row of a CSV file and updates the accumulator.
You would implement one of these to use with the foldCSVFile
function.
funToIter :: CSVeable r => (a -> ParsedRow r -> a) -> CSVAction r aSource
Convenience converter for fold step functions that are pure.
Use this if you don't want to deal with Iteratees when writing your fold functions.
funToIterIO :: CSVeable r => (a -> ParsedRow r -> IO a) -> CSVAction r aSource
Convenience converter for fold step functions that live in the IO monad.
Use this if you don't want to deal with Iteratees when writing your fold functions.
Mapping Over CSV Files
:: CSVeable r | |
=> FilePath | Input file |
-> CSVSettings | CSV Settings |
-> (r -> [r]) | A function to map a row onto rows |
-> FilePath | Output file |
-> IO (Either SomeException Int) | Number of rows processed |
Take a CSV file, apply function to each of its rows and save the resulting rows into a new file.
Each row is simply a list of fields.
mapAccumCSVFile :: CSVeable r => FilePath -> CSVSettings -> (acc -> r -> (acc, [r])) -> acc -> FilePath -> IO (Either SomeException acc)Source
Map-accumulate over a CSV file. Similar to mapAccumL
in Data.List
.
:: CSVeable r | |
=> CSVSettings | |
-> Bool | Whether to write headers |
-> Handle | Handle to stream results |
-> (r -> IO [r]) | Map function |
-> Iteratee ByteString IO Int | Resulting Iteratee |
Create an iteratee that can map over a CSV stream and output results to a handle in an interleaved fashion.
Example use: Let's map over a CSV file coming in through stdin
and push
results to stdout
.
f r = return [r] -- a function that just returns the given row
E.run (E.enumHandle 4096 stdin $$ mapIntoHandle defCSVSettings True stdout f)
This nicely allows us to do things like (assuming you have pv installed):
pv inputFile.csv | myApp > output.CSV
And monitor the ongoing progress of processing.
Primitive Iteratees
collectRows :: CSVeable r => CSVAction r [r]Source
Just collect all rows into an array. This will cancel out the incremental nature of this library.
outputRowIter :: CSVeable r => CSVSettings -> Handle -> r -> Iteratee ByteString IO ()Source
outputRowsIter :: CSVeable r => CSVSettings -> Handle -> [r] -> Iteratee ByteString IO ()Source
Other Utilities
outputRow :: CSVeable r => CSVSettings -> Handle -> r -> IO ()Source
Output given row into given handle
outputRows :: CSVeable r => CSVSettings -> Handle -> [r] -> IO ()Source
outputColumns :: CSVSettings -> Handle -> [ByteString] -> MapRow -> IO ()Source
writeHeaders :: CSVeable r => CSVSettings -> Handle -> [r] -> IO BoolSource