csv-conduit-0.7.1.0: A flexible, fast, conduit-based CSV parser library for Haskell.

Safe HaskellNone
LanguageHaskell98

Data.CSV.Conduit

Contents

Synopsis

Main Interface

decodeCSV :: forall v a s. (Vector v a, CSV s a) => CSVSettings -> s -> Either SomeException (v a) Source #

A simple way to decode a CSV string. Don't be alarmed by the polymorphic nature of the signature. s is the type for the string and v is a kind of Vector here.

For example for ByteString:

>>> s <- LB.readFile "my.csv"
>>> decodeCSV defCSVSettings s :: Either SomeException (Vector (Vector ByteString))

will work as long as the data is comma separated.

readCSVFile Source #

Arguments

:: (MonadIO m, CSV ByteString a) 
=> CSVSettings

Settings to use in deciphering stream

-> FilePath

Input file

-> m (Vector a) 

Read the entire contents of a CSV file into memory.

writeCSVFile Source #

Arguments

:: CSV ByteString a 
=> CSVSettings

CSV Settings

-> FilePath

Target file

-> IOMode

Write vs. append mode

-> [a]

List of rows

-> IO () 

Write CSV data into file. As we use a ByteString sink, you'll need to get your data into a ByteString stream type.

transformCSV Source #

Arguments

:: (MonadThrow m, CSV s a, CSV s' b) 
=> CSVSettings

Settings to be used for both input and output

-> ConduitM () s m ()

A raw stream data source. Ex: 'sourceFile inFile'

-> ConduitM a b m ()

A transforming conduit

-> ConduitM s' Void m ()

A raw stream data sink. Ex: 'sinkFile outFile'

-> m () 

Like transformCSV' but uses the same settings for both input and output.

transformCSV' Source #

Arguments

:: (MonadThrow m, CSV s a, CSV s' b) 
=> CSVSettings

Settings to be used for input

-> CSVSettings

Settings to be used for output

-> ConduitM () s m ()

A raw stream data source. Ex: 'sourceFile inFile'

-> ConduitM a b m ()

A transforming conduit

-> ConduitM s' Void m ()

A raw stream data sink. Ex: 'sinkFile outFile'

-> m () 

General purpose CSV transformer. Apply a list-like processing function from List to the rows of a CSV stream. You need to provide a stream data source, a transformer and a stream data sink.

An easy way to run this function would be runResourceT after feeding it all the arguments.

Example - map a function over the rows of a CSV file:

transformCSV setIn setOut (sourceFile inFile) (C.map f) (sinkFile outFile)

mapCSVFile Source #

Arguments

:: (MonadResource m, CSV ByteString a, CSV ByteString b, MonadThrow m) 
=> CSVSettings

Settings to use both for both input and output

-> (a -> [b])

A mapping function

-> FilePath

Input file

-> FilePath

Output file

-> m () 

Map over the rows of a CSV file. Provided for convenience for historical reasons.

An easy way to run this function would be runResourceT after feeding it all the arguments.

writeHeaders :: (Monad m, CSV s (Row r), IsString s) => CSVSettings -> ConduitM (MapRow r) s m () Source #

Write headers AND the row into the output stream, once. If you don't call this while using MapRow family of row types, then your resulting output will NOT have any headers in it.

Usage: Just chain this using the Monad instance in your pipeline:

runConduit $ ... .| writeHeaders settings >> fromCSV settings .| sinkFile "..."

class CSV s r where Source #

Represents types r that are CSV-like and can be converted to/from an underlying stream of type s. There is nothing scary about the type:

s represents stream types that can be converted to/from CSV rows. Examples are ByteString, Text and String.

r represents the target CSV row representations that this library can work with. Examples are the Row types, the Record type and the MapRow family of types. We can also convert directly to complex Haskell types using the Conversion module that was borrowed from the cassava package, which was itself inspired by the aeson package.

Example #1: Basics Using Convenience API

import Data.Conduit
import Data.Conduit.Binary
import Data.Conduit.List as CL
import Data.CSV.Conduit

myProcessor :: Conduit (Row Text) m (Row Text)
myProcessor = CL.map reverse

test = runResourceT $
  transformCSV defCSVSettings
               (sourceFile "input.csv")
               myProcessor
               (sinkFile "output.csv")

Example #2: Basics Using Conduit API

import Data.Conduit
import Data.Conduit.Binary
import Data.CSV.Conduit

myProcessor :: Conduit (MapRow Text) m (MapRow Text)
myProcessor = undefined

test = runResourceT $ runConduit $
  sourceFile "test/BigFile.csv" .|
  intoCSV defCSVSettings .|
  myProcessor .|
  (writeHeaders defCSVSettings >> fromCSV defCSVSettings) .|
  sinkFile "test/BigFileOut.csv"

Methods

rowToStr :: CSVSettings -> r -> s Source #

Convert a CSV row into strict ByteString equivalent.

intoCSV :: MonadThrow m => CSVSettings -> ConduitM s r m () Source #

Turn a stream of s into a stream of CSV row type. An example would be parsing a ByteString stream as rows of MapRow Text.

fromCSV :: Monad m => CSVSettings -> ConduitM r s m () Source #

Turn a stream of CSV row type back into a stream of s. An example would be rendering a stream of Row ByteString rows as Text.

Instances
(FromNamedRecord a, ToNamedRecord a, CSV s (MapRow ByteString)) => CSV s (Named a) Source #

Conversion of stream directly to/from a custom complex haskell type.

Instance details

Defined in Data.CSV.Conduit

(CSV s (Row s'), Ord s', IsString s) => CSV s (MapRow s') Source #

Generic MapRow instance; any stream type with a Row instance automatically gets a MapRow instance.

Instance details

Defined in Data.CSV.Conduit

CSV s (Row s) => CSV s (Vector s) Source #

Support for parsing rows in the Vector form.

Instance details

Defined in Data.CSV.Conduit

CSV ByteString (Row ByteString) Source #

Row instance using ByteString

Instance details

Defined in Data.CSV.Conduit

CSV ByteString (Row Text) Source #

Row instance using Text based on ByteString stream

Instance details

Defined in Data.CSV.Conduit

CSV ByteString (Row String) Source #

Row instance using String based on ByteString stream. Please note this uses the ByteString operations underneath and has lots of unnecessary overhead. Included for convenience.

Instance details

Defined in Data.CSV.Conduit

CSV Text (Row Text) Source #

Row instance using Text

Instance details

Defined in Data.CSV.Conduit

data CSVSettings Source #

Settings for a CSV file. This library is intended to be flexible and offer a way to process the majority of text data files out there.

Constructors

CSVSettings 

Fields

  • csvSep :: !Char

    Separator character to be used in between fields

  • csvQuoteChar :: !(Maybe Char)

    Quote character that may sometimes be present around fields. If Nothing is given, the library will never expect quotation even if it is present.

defCSVSettings :: CSVSettings Source #

Default settings for a CSV file.

csvSep = ','
csvQuoteChar = Just '"'

type MapRow a = Map a a Source #

A MapRow is a dictionary based on Map where column names are keys and row's individual cell values are the values of the Map.

type Row a = [a] Source #

A Row is just a list of fields

Re-exported For Convenience

runResourceT :: MonadUnliftIO m => ResourceT m a -> m a #

Unwrap a ResourceT transformer, and call all registered release actions.

Note that there is some reference counting involved due to resourceForkIO. If multiple threads are sharing the same collection of resources, only the last call to runResourceT will deallocate the resources.

NOTE Since version 1.2.0, this function will throw a ResourceCleanupException if any of the cleanup functions throw an exception.

Since: resourcet-0.3.0