csv-enumerator-0.10.2.0: A flexible, fast, enumerator-based CSV parser library for Haskell.

Safe Haskell: None

Data.CSV.Enumerator

CSV Data types

type Row = [Field]

class CSVeable r where

Methods

rowToStr :: CSVSettings -> r -> ByteString

Convert a CSV row into strict ByteString equivalent.

fileHeaders :: [r] -> Maybe Row

Possibly return headers for a list of rows.

iterCSV :: CSVSettings -> CSVAction r a -> a -> Iteratee ByteString IO a

The raw iteratee to process any Enumerator stream.

fileSink :: CSVSettings -> FilePath -> (Maybe Handle, Int) -> ParsedRow r -> Iteratee ByteString IO (Maybe Handle, Int)

Iteratee to push rows into a given file.

mapCSVFiles

Arguments

:: [FilePath]

Input files

-> CSVSettings

CSV Settings

-> (r -> [r])

A function to map a row onto rows

-> FilePath

Output file

-> IO (Either SomeException Int)

Number of rows processed

Like mapCSVFile, but operates on multiple input files and pours the results into a single output file.

Instances

data ParsedRow r

A datatype that incorporates the signaling of parsing status to the user-developed iteratee.

We need this because some iteratees do interleaved IO (such as outputting to a file via a handle inside the accumulator) and some final actions may need to be taken upon encountering EOF (such as closing the interleaved handle).

Use this datatype when developing iteratees for use with the fold* family of functions (Row enumerators).

Constructors

ParsedRow (Maybe r) 
EOF 

CSV Settings

data CSVSettings

Settings for a CSV file. This library is intended to be flexible and offer a way to process the majority of text data files out there.

Constructors

CSVS 

Fields

csvSep :: !Char

Separator character to be used in between fields

csvQuoteChar :: !(Maybe Char)

Quote character that may sometimes be present around fields. If Nothing is given, the library will never expect quotation even if it is present.

csvOutputQuoteChar :: !(Maybe Char)

Quote character that should be used in the output.

csvOutputColSep :: !Char

Field separator that should be used in the output.

defCSVSettings :: CSVSettings

Default settings for a CSV file.

 csvSep = ','
 csvQuoteChar = Just '"'
 csvOutputQuoteChar = Just '"'
 csvOutputColSep = ','
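
The defaults can be adjusted with ordinary record-update syntax. The following is a minimal sketch, not taken from the library's own documentation. The record declared here is a local stand-in that mirrors the fields documented above so the block compiles on its own; with csv-enumerator installed you would instead import CSVSettings(..) and defCSVSettings from Data.CSV.Enumerator. The name tsvToCsv is hypothetical.

```haskell
-- Local stand-in mirroring the documented CSVSettings record, so this
-- block compiles without the library installed.
data CSVSettings = CSVS
  { csvSep             :: !Char
  , csvQuoteChar       :: !(Maybe Char)
  , csvOutputQuoteChar :: !(Maybe Char)
  , csvOutputColSep    :: !Char
  } deriving (Show, Eq)

-- The documented default values.
defCSVSettings :: CSVSettings
defCSVSettings = CSVS
  { csvSep             = ','
  , csvQuoteChar       = Just '"'
  , csvOutputQuoteChar = Just '"'
  , csvOutputColSep    = ','
  }

-- Hypothetical settings: read tab-separated, unquoted input while keeping
-- the default quoted, comma-separated output.
tsvToCsv :: CSVSettings
tsvToCsv = defCSVSettings
  { csvSep       = '\t'
  , csvQuoteChar = Nothing
  }
```

Only the input-side fields are overridden here, so the output separator and output quote character keep their default values.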

Reading / Writing CSV Files

These are some simple file-related operations for basic use cases.

readCSVFile

Arguments

:: CSVeable r 
=> CSVSettings

CSV settings

-> FilePath

FilePath

-> IO (Either SomeException [r])

Collected data

writeCSVFile

Arguments

:: CSVeable r 
=> CSVSettings

CSV settings

-> FilePath

Target file path

-> [r]

Data to be output

-> IO Int

Number of rows written

appendCSVFile

Arguments

:: CSVeable r 
=> CSVSettings

CSV settings

-> FilePath

Target file path

-> [r]

Data to be output

-> IO Int

Number of rows written

Very Basic CSV Operations (for Debugging or Quick&Dirty Needs)

parseCSV :: CSVSettings -> ByteString -> Either String [Row]

Try to parse the given string as CSV.

parseRow :: CSVSettings -> ByteString -> Either String (Maybe Row)

Try to parse the given string as a single Row.

Generic Folds Over CSV Files

These operations let you do whatever you want with CSV files, including interleaved IO.

foldCSVFile

Arguments

:: CSVeable r 
=> FilePath

File to open as a CSV file

-> CSVSettings

CSV settings to use on the input file

-> CSVAction r a

Fold action

-> a

Initial accumulator

-> IO (Either SomeException a)

Error or the resulting accumulator

Open & fold over the CSV file.

For the MapRow instance, processing starts on row 2 so that the first row can be used as column headers.

type CSVAction r a = a -> ParsedRow r -> Iteratee ByteString IO a

An iteratee that processes each row of a CSV file and updates the accumulator.

You would implement one of these to use with the foldCSVFile function.

funToIter :: CSVeable r => (a -> ParsedRow r -> a) -> CSVAction r a

Convenience converter for fold step functions that are pure.

Use this if you don't want to deal with Iteratees when writing your fold functions.

funToIterIO :: CSVeable r => (a -> ParsedRow r -> IO a) -> CSVAction r a

Convenience converter for fold step functions that live in the IO monad.

Use this if you don't want to deal with Iteratees when writing your fold functions.
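
To illustrate the shape of a pure fold step, here is a small sketch that counts rows. It is an in-memory illustration only: the ParsedRow type is redeclared locally, mirroring the two constructors documented above, so that the block runs without the library, and a plain foldl stands in for the stream the library would drive. The names countRows, exampleStream and exampleCount are hypothetical, and the meaning of ParsedRow Nothing is not spelled out in this documentation, so the sketch simply skips it.

```haskell
-- Local stand-in mirroring the documented ParsedRow constructors.
data ParsedRow r = ParsedRow (Maybe r) | EOF

-- Pure fold step of the shape (a -> ParsedRow r -> a), with a = Int:
-- count only the rows that were actually delivered.
countRows :: Int -> ParsedRow r -> Int
countRows n (ParsedRow (Just _)) = n + 1
countRows n (ParsedRow Nothing)  = n  -- no row delivered; skipped in this sketch
countRows n EOF                  = n  -- end of input; nothing to clean up in a pure fold

-- In-memory stand-in for the sequence of values a fold would receive.
exampleStream :: [ParsedRow [String]]
exampleStream =
  [ ParsedRow (Just ["a", "1"])
  , ParsedRow Nothing
  , ParsedRow (Just ["b", "2"])
  , EOF
  ]

-- Drive the step function over the stand-in stream.
exampleCount :: Int
exampleCount = foldl countRows 0 exampleStream
```

Going by the signatures above, with the library a step function like this would be wrapped as funToIter countRows and passed to foldCSVFile together with an initial accumulator of 0, with r fixed to a concrete row type such as Row.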

Mapping Over CSV Files

mapCSVFile

Arguments

:: CSVeable r 
=> FilePath

Input file

-> CSVSettings

CSV Settings

-> (r -> [r])

A function to map a row onto rows

-> FilePath

Output file

-> IO (Either SomeException Int)

Number of rows processed

Take a CSV file, apply function to each of its rows and save the resulting rows into a new file.

Each row is simply a list of fields.
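
Because the mapping function returns a list, one input row may produce zero, one or several output rows. The sketch below shows that shape on an in-memory list. It is an analogue only and does not touch files or the library; it assumes results are written in input order. The names expandRow, mapRows and exampleOutput, and the rule the function applies, are made up for illustration.

```haskell
-- A row function of the documented shape (r -> [r]), here with r = [String].
-- Hypothetical rule: drop rows whose first field is empty, emit rows whose
-- second field is "x2" twice, and pass everything else through unchanged.
expandRow :: [String] -> [[String]]
expandRow ("" : _)         = []
expandRow r@(_ : "x2" : _) = [r, r]
expandRow r                = [r]

-- In-memory analogue of mapping over a file: apply the function to each
-- row and concatenate the results in order.
mapRows :: (r -> [r]) -> [r] -> [r]
mapRows = concatMap

exampleOutput :: [[String]]
exampleOutput = mapRows expandRow [["a", "x1"], ["", "x1"], ["b", "x2"]]
```

Here exampleOutput keeps the first row, drops the second and contains the third twice.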

mapCSVFileM

Arguments

:: CSVeable r 
=> FilePath

Input file

-> CSVSettings

CSV Settings

-> (r -> IO [r])

A function to map a row onto rows

-> FilePath

Output file

-> IO (Either SomeException Int)

Number of rows processed

Take a CSV file, apply an IO action to each of its rows and save the resulting rows into a new file.

Each row is simply a list of fields.

mapCSVFileM_

Arguments

:: CSVeable r 
=> FilePath

Input file

-> CSVSettings

CSV Settings

-> (r -> IO a)

A function to process rows

-> IO (Either SomeException Int)

Number of rows processed

Take a CSV file, apply an IO action to each of its rows and discard the results.

mapAccumCSVFile :: CSVeable r => FilePath -> CSVSettings -> (acc -> r -> (acc, [r])) -> acc -> FilePath -> IO (Either SomeException acc)

Map-accumulate over a CSV file. Similar to mapAccumL in Data.List.
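
For readers unfamiliar with mapAccumL, the following sketch shows the same step-function shape applied to an in-memory list of rows. It uses only Data.List and is not the library's implementation; numberRow, mapAccumRows and exampleResult are hypothetical names chosen for the example.

```haskell
import Data.List (mapAccumL)

-- Step of the documented shape (acc -> r -> (acc, [r])), with acc = Int and
-- r = [String]: prefix every row with a running line number.
numberRow :: Int -> [String] -> (Int, [[String]])
numberRow n row = (n + 1, [show n : row])

-- In-memory analogue: thread the accumulator through the rows, then
-- flatten the per-row output lists.
mapAccumRows :: (acc -> r -> (acc, [r])) -> acc -> [r] -> (acc, [r])
mapAccumRows step acc0 rows =
  let (accFinal, outs) = mapAccumL step acc0 rows
  in  (accFinal, concat outs)

exampleResult :: (Int, [[String]])
exampleResult = mapAccumRows numberRow 1 [["a"], ["b"], ["c"]]
```

The first component of exampleResult is the final accumulator (4 here), which corresponds to the acc that mapAccumCSVFile returns on success; the second component holds the numbered rows.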

mapIntoHandle

Arguments

:: CSVeable r 
=> CSVSettings

CSVSettings

-> Bool

Whether to write headers

-> Handle

Handle to stream results

-> (r -> IO [r])

Map function

-> Iteratee ByteString IO Int

Resulting Iteratee

Create an iteratee that can map over a CSV stream and output results to a handle in an interleaved fashion.

Example use: Let's map over a CSV file coming in through stdin and push results to stdout.

 f r = return [r] -- a function that just returns the given row
 E.run (E.enumHandle 4096 stdin $$ mapIntoHandle defCSVSettings True stdout f)

This nicely allows us to do things like (assuming you have pv installed):

 pv inputFile.csv | myApp > output.CSV

And monitor the ongoing progress of processing.

Primitive Iteratees

collectRows :: CSVeable r => CSVAction r [r]

Collect all rows into a list. This cancels out the incremental nature of this library.

Other Utilities

outputRow :: CSVeable r => CSVSettings -> Handle -> r -> IO ()

Output the given row to the given handle.

outputColumns :: CSVSettings -> Handle -> [ByteString] -> MapRow -> IO ()

Expand or contract the given MapRow to contain exactly the given set of columns and then write the row into the given Handle.

This is helpful in filtering the columns or perhaps combining a number of files that don't have the same columns.

Missing columns will be left empty.
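
The expand-or-contract behaviour described above can be sketched as a pure function over a map. This is an illustration of the described behaviour, not the library's code. It assumes that MapRow behaves like a map from column name to field value, and it uses String instead of ByteString to keep the block dependency-light. MapRow', alignColumns and exampleAligned are hypothetical names.

```haskell
import qualified Data.Map as Map

-- Simplified stand-in for MapRow: column name -> field value.
type MapRow' = Map.Map String String

-- Produce exactly the requested columns, in the requested order.
-- Columns missing from the row become empty fields; columns present in
-- the row but not requested are dropped.
alignColumns :: [String] -> MapRow' -> [String]
alignColumns cols row = [Map.findWithDefault "" c row | c <- cols]

exampleAligned :: [String]
exampleAligned =
  alignColumns ["id", "name", "email"]
               (Map.fromList [("name", "Ada"), ("id", "7"), ("phone", "555")])
```

In exampleAligned the unrequested phone column is dropped and the missing email column comes out as an empty field, which is the behaviour that makes combining files with different column sets workable.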