csv-conduit-0.6.2.1: A flexible, fast, conduit-based CSV parser library for Haskell.

Safe HaskellNone

Data.CSV.Conduit

Contents

Synopsis

Main Interface

decodeCSV :: (Vector v a, CSV s a) => CSVSettings -> s -> Either SomeException (v a)Source

A simple way to decode a CSV string. Don't be alarmed by the polymorphic nature of the signature. s is the type for the string and v is a kind of Vector here.

For example for ByteString:

>>> s <- LB.readFile "my.csv"
>>> decodeCSV 'def' s :: Vector (Vector ByteString)

will just work.

readCSVFile :: (MonadIO m, CSV ByteString a) => CSVSettings -> FilePath -> m (Vector a)Source

Read the entire contents of a CSV file into memory. readCSVFile :: (GV.Vector v a, CSV ByteString a) => CSVSettings -- ^ Settings to use in deciphering stream -> FilePath -- ^ Input file -> IO (v a)

writeCSVFileSource

Arguments

:: CSV ByteString a 
=> CSVSettings

CSV Settings

-> FilePath

Target file

-> IOMode

Write vs. append mode

-> [a]

List of rows

-> IO () 

Write CSV data into file. As we use a ByteString sink, you'll need to get your data into a ByteString stream type.

transformCSVSource

Arguments

:: (MonadThrow m, CSV s a, CSV s' b) 
=> CSVSettings

Settings to be used for both input and output

-> Source m s

A raw stream data source. Ex: 'sourceFile inFile'

-> Conduit a m b

A transforming conduit

-> Sink s' m ()

A raw stream data sink. Ex: 'sinkFile outFile'

-> m () 

General purpose CSV transformer. Apply a list-like processing function from List to the rows of a CSV stream. You need to provide a stream data source, a transformer and a stream data sink.

An easy way to run this function would be runResourceT after feeding it all the arguments.

Example - map a function over the rows of a CSV file:

 transformCSV set (sourceFile inFile) (C.map f) (sinkFile outFile)

mapCSVFileSource

Arguments

:: (MonadResource m, MonadThrow m, CSV ByteString a, CSV ByteString b) 
=> CSVSettings

Settings to use both for both input and output

-> (a -> [b])

A mapping function

-> FilePath

Input file

-> FilePath

Output file

-> m () 

Map over the rows of a CSV file. Provided for convenience for historical reasons.

An easy way to run this function would be runResourceT after feeding it all the arguments.

writeHeaders :: (Monad m, CSV s (Row r), IsString s) => CSVSettings -> Conduit (MapRow r) m sSource

Write headers AND the row into the output stream, once. If you don't call this while using MapRow family of row types, then your resulting output will NOT have any headers in it.

Usage: Just chain this using the Monad instance in your pipeline:

 ... =$= writeHeaders settings >> fromCSV settings $$ sinkFile "..."

class CSV s r whereSource

Represents types r that are CSV-like and can be converted to/from an underlying stream of type s. There is nothing scary about the type:

s represents stream types that can be converted to/from CSV rows. Examples are ByteString, Text and String.

r represents the target CSV row representations that this library can work with. Examples are the Row types, the Record type and the MapRow family of types. We can also convert directly to complex Haskell types using the Conversion module that was borrowed from the cassava package, which was itself inspired by the aeson package.

Example #1: Basics Using Convenience API

import Data.Conduit
import Data.Conduit.Binary
import Data.Conduit.List as CL
import Data.CSV.Conduit

myProcessor :: Conduit (Row Text) m (Row Text)
myProcessor = CL.map reverse

test = runResourceT $
  transformCSV defCSVSettings
               (sourceFile "input.csv")
               myProcessor
               (sinkFile "output.csv")

Example #2: Basics Using Conduit API

import Data.Conduit
import Data.Conduit.Binary
import Data.CSV.Conduit

myProcessor :: Conduit (MapRow Text) m (MapRow Text)
myProcessor = undefined

test = runResourceT $
  sourceFile "test/BigFile.csv" $=
  intoCSV defCSVSettings $=
  myProcessor $=
  (writeHeaders defCSVSettings >> fromCSV defCSVSettings) $$
  sinkFile "test/BigFileOut.csv"

Methods

rowToStr :: CSVSettings -> r -> sSource

Convert a CSV row into strict ByteString equivalent.

intoCSV :: MonadThrow m => CSVSettings -> Conduit s m rSource

Turn a stream of s into a stream of CSV row type. An example would be parsing a ByteString stream as rows of MapRow Text.

fromCSV :: Monad m => CSVSettings -> Conduit r m sSource

Turn a stream of CSV row type back into a stream of s. An example would be rendering a stream of Row ByteString rows as Text.

Instances

(FromNamedRecord a, ToNamedRecord a, CSV s (MapRow ByteString)) => CSV s (Named a)

Conversion of stream directly to/from a custom complex haskell type.

(CSV s (Row s'), Ord s', IsString s) => CSV s (MapRow s')

Generic MapRow instance; any stream type with a Row instance automatically gets a MapRow instance.

CSV s (Row s) => CSV s (Vector s)

Support for parsing rows in the Vector form.

CSV ByteString (Row String)

Row instance using String based on ByteString stream. Please note this uses the ByteString operations underneath and has lots of unnecessary overhead. Included for convenience.

CSV ByteString (Row ByteString)

Row instance using ByteString

CSV ByteString (Row Text)

Row instance using Text based on ByteString stream

CSV Text (Row Text)

Row instance using Text

data CSVSettings Source

Settings for a CSV file. This library is intended to be flexible and offer a way to process the majority of text data files out there.

Constructors

CSVSettings 

Fields

csvSep :: !Char

Separator character to be used in between fields

csvQuoteChar :: !(Maybe Char)

Quote character that may sometimes be present around fields. If Nothing is given, the library will never expect quotation even if it is present.

defCSVSettings :: CSVSettingsSource

Default settings for a CSV file.

 csvSep = ','
 csvQuoteChar = Just '"'

type MapRow a = Map a aSource

A MapRow is a dictionary based on Map where column names are keys and row's individual cell values are the values of the Map.

type Row a = [a]Source

A Row is just a list of fields

Re-exported For Convenience

runResourceT :: MonadBaseControl IO m => ResourceT m a -> m a

Unwrap a ResourceT transformer, and call all registered release actions.

Note that there is some reference counting involved due to resourceForkIO. If multiple threads are sharing the same collection of resources, only the last call to runResourceT will deallocate the resources.

Since 0.3.0