csv-conduit-0.4.1: A flexible, fast, conduit-based CSV parser library for Haskell.

Safe HaskellNone

Data.CSV.Conduit

Contents

Synopsis

Key Operations

class CSV s r whereSource

Represents types r that are CSV-like and can be converted to/from an underlying stream of type s.

Example #1: Basics Using Convenience API

import Data.Conduit
import Data.Conduit.Binary
import Data.Conduit.List as CL
import Data.CSV.Conduit

myProcessor :: Conduit (Row Text) m (Row Text)
myProcessor = CL.map reverse

test = runResourceT $
  transformCSV defCSVSettings
               (sourceFile "input.csv")
               myProcessor
               (sinkFile "output.csv")

Example #2: Basics Using Conduit API

import Data.Conduit
import Data.Conduit.Binary
import Data.CSV.Conduit

myProcessor :: Conduit (MapRow Text) m (MapRow Text)
myProcessor = undefined

test = runResourceT $
  sourceFile "test/BigFile.csv" $=
  intoCSV defCSVSettings $=
  myProcessor $=
  (writeHeaders defCSVSettings >> fromCSV defCSVSettings) $$
  sinkFile "test/BigFileOut.csv"

Methods

rowToStr :: CSVSettings -> r -> sSource

Convert a CSV row into strict ByteString equivalent.

intoCSV :: MonadThrow m => CSVSettings -> GLInfConduit s m rSource

Turn a stream of s into a stream of CSV row type. An example would be parsing a ByteString stream as rows of MapRow Text.

fromCSV :: Monad m => CSVSettings -> GInfConduit r m sSource

Turn a stream of CSV row type back into a stream of s. An example would be rendering a stream of Row ByteString rows as Text.

Instances

(CSV s (Row s'), Ord s', IsString s) => CSV s (MapRow s')

Generic MapRow instance; any stream type with a Row instance automatically gets a MapRow instance.

CSV ByteString (Row String)

Row instance using String based on ByteString stream. Please note this uses the ByteString operations underneath and has lots of unnecessary overhead. Included for convenience.

CSV ByteString (Row ByteString)

Row instance using ByteString

CSV ByteString (Row Text)

Row instance using Text based on ByteString stream

CSV Text (Row Text)

Row instance using Text

writeHeaders :: (Monad m, CSV s (Row r), IsString s) => CSVSettings -> GConduit (MapRow r) m sSource

Write headers AND the row into the output stream, once. Just chain this using the Monad instance in your pipeline:

 ... =$= writeHeaders settings >> fromCSV settings $$ sinkFile "..."

Convenience Functions

readCSVFileSource

Arguments

:: CSV ByteString a 
=> CSVSettings

Settings to use in deciphering stream

-> FilePath

Input file

-> IO [a] 

Read the entire contents of a CSV file into memory.

An easy way to run this function would be runResourceT after feeding it all the arguments.

writeCSVFileSource

Arguments

:: CSV ByteString a 
=> CSVSettings

CSV Settings

-> FilePath

Target file

-> IOMode

Write vs. append mode

-> [a]

List of rows

-> IO () 

Write CSV data into file.

transformCSVSource

Arguments

:: (MonadThrow m, CSV s a, CSV s' b) 
=> CSVSettings

Settings to be used for input and output

-> Source m s

A raw stream data source. Ex: 'sourceFile inFile'

-> Conduit a m b

A transforming conduit

-> Sink s' m ()

A raw stream data sink. Ex: 'sinkFile outFile'

-> m () 

General purpose CSV transformer. Apply a list-like processing function from List to the rows of a CSV stream. You need to provide a stream data source, a transformer and a stream data sink.

An easy way to run this function would be runResourceT after feeding it all the arguments.

Example - map a function over the rows of a CSV file:

 transformCSV set (sourceFile inFile) (C.map f) (sinkFile outFile)

mapCSVFileSource

Arguments

:: (MonadResource m, MonadThrow m, CSV ByteString a, CSV ByteString b) 
=> CSVSettings

Settings to use both for input and output

-> (a -> [b])

A mapping function

-> FilePath

Input file

-> FilePath

Output file

-> m () 

Map over the rows of a CSV file. Provided for convenience for historical reasons.

An easy way to run this function would be runResourceT after feeding it all the arguments.

Important Types

data CSVSettings Source

Settings for a CSV file. This library is intended to be flexible and offer a way to process the majority of text data files out there.

Constructors

CSVS 

Fields

csvSep :: !Char

Separator character to be used in between fields

csvQuoteChar :: !(Maybe Char)

Quote character that may sometimes be present around fields. If Nothing is given, the library will never expect quotation even if it is present.

csvOutputQuoteChar :: !(Maybe Char)

Quote character that should be used in the output.

csvOutputColSep :: !Char

Field separator that should be used in the output.

defCSVSettings :: CSVSettingsSource

Default settings for a CSV file.

 csvSep = ','
 csvQuoteChar = Just '"'
 csvOutputQuoteChar = Just '"'
 csvOutputColSep = ','

type MapRow a = Map a aSource

A MapRow is a dictionary based on Map

type Row a = [a]Source

A Row is just a list of fields

Re-exported For Convenience

runResourceT :: MonadBaseControl IO m => ResourceT m a -> m a

Unwrap a ResourceT transformer, and call all registered release actions.

Note that there is some reference counting involved due to resourceForkIO. If multiple threads are sharing the same collection of resources, only the last call to runResourceT will deallocate the resources.

Since 0.3.0