cassava-0.2.2.0: A CSV parsing and encoding library

Safe Haskell: None

Data.Csv.Parser

Description

A CSV parser. The parser defined here is RFC 4180 compliant, with the following extensions:

  • Empty lines are ignored.
  • Non-escaped fields may contain any characters except double-quotes, commas, carriage returns, and newlines.
  • Escaped fields may contain any characters (but double-quotes need to be escaped).

The functions in this module can be used to implement e.g. a resumable parser that is fed input incrementally.
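As a rough illustration of incremental feeding, the sketch below drives the `csv` parser with attoparsec's `parse` and `feed`. It assumes the `Parser` type here is attoparsec's `ByteString` parser and that `Csv` is a `Vector` of `Vector`s of strict `ByteString`s. The helper name `parseChunks` is hypothetical and not part of this module.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.Attoparsec.ByteString as A
import qualified Data.ByteString as B
import Data.Csv.Parser (csv, defaultDecodeOptions)
import qualified Data.Vector as V

-- | Hypothetical helper: feed the chunks to the parser one at a time, then
-- pass an empty chunk to tell attoparsec that the input has ended.
-- Chunks are assumed to be non-empty, since an empty chunk means "end of input".
parseChunks :: [B.ByteString] -> Either String [[B.ByteString]]
parseChunks [] = Left "no input"
parseChunks (c : cs) =
    case A.feed (foldl A.feed (A.parse parser c) cs) B.empty of
        A.Done _ rows  -> Right (map V.toList (V.toList rows))
        A.Fail _ _ err -> Left err
        A.Partial _    -> Left "parser still expects more input"
  where
    parser = csv defaultDecodeOptions

main :: IO ()
main = print (parseChunks ["a,b\nc,", "d\n"])
```

Note that the second record is split across the two chunks; the parser suspends in the middle of it and resumes when the next chunk arrives.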

Documentation

data DecodeOptions

Options that control how data is decoded. These options can be used to e.g. decode tab-separated data instead of comma-separated data.

To avoid having your program stop compiling when new fields are added to DecodeOptions, create option records by overriding values in defaultDecodeOptions. Example:

 import Data.Char (ord)

 -- Override only the delimiter; all other fields keep their defaults.
 myOptions = defaultDecodeOptions {
        decDelimiter = fromIntegral (ord '\t')
      }
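To show how such an options record might be put to use, here is a minimal sketch that passes tab-delimited options to `csv` and runs the parser with attoparsec's `parseOnly`. The names `tabOptions` and `parseTsv` are hypothetical, and the sketch assumes the parser operates on strict `ByteString` input.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Attoparsec.ByteString (parseOnly)
import qualified Data.ByteString as B
import Data.Char (ord)
import Data.Csv.Parser (DecodeOptions (..), csv, defaultDecodeOptions)
import qualified Data.Vector as V

-- | Options for tab-separated input, built by overriding the default record.
tabOptions :: DecodeOptions
tabOptions = defaultDecodeOptions { decDelimiter = fromIntegral (ord '\t') }

-- | Hypothetical helper: parse tab-separated input into plain lists of fields.
parseTsv :: B.ByteString -> Either String [[B.ByteString]]
parseTsv input = fmap toLists (parseOnly (csv tabOptions) input)
  where
    toLists = map V.toList . V.toList

main :: IO ()
main = print (parseTsv "1\tAda\n2\tGrace\n")
```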

Constructors

DecodeOptions 

Fields

decDelimiter :: !Word8

Field delimiter.

defaultDecodeOptions :: DecodeOptions

Default decoding options for parsing CSV files. The field delimiter is a comma.

csv :: DecodeOptions -> Parser Csv

Parse a CSV file that does not include a header.

csvWithHeader :: DecodeOptions -> Parser (Header, Vector NamedRecord)

Parse a CSV file that includes a header.
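A small sketch of looking up a column by name in the result of `csvWithHeader`. It assumes that `NamedRecord` is a `HashMap` from `ByteString` names to `ByteString` fields (from the unordered-containers package), so that `HM.lookup` applies. The helper `columnValues` is hypothetical.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Attoparsec.ByteString (parseOnly)
import qualified Data.ByteString as B
import Data.Csv.Parser (csvWithHeader, defaultDecodeOptions)
import qualified Data.HashMap.Strict as HM
import qualified Data.Vector as V

-- | Hypothetical helper: return the value of one named column for every
-- record, or Nothing for records where that column is missing.
-- Assumes NamedRecord is a HashMap keyed by the header names.
columnValues :: B.ByteString -> B.ByteString -> Either String [Maybe B.ByteString]
columnValues column input = do
    (_hdr, rows) <- parseOnly (csvWithHeader defaultDecodeOptions) input
    return (map (HM.lookup column) (V.toList rows))

main :: IO ()
main = print (columnValues "name" "id,name\n1,Ada\n2,Grace\n")
```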

header

Arguments

:: Word8

Field delimiter

-> Parser Header 

Parse a header, including the terminating line separator.

record

Arguments

:: Word8

Field delimiter

-> Parser Record 

Parse a record, not including the terminating line separator. The terminating line separator is not included because the last record in a CSV file is allowed to omit it. You most likely want to combine this parser with the endOfLine parser.
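One way the combination with an end-of-line parser might look is sketched below. It assumes that attoparsec's `endOfLine` from `Data.Attoparsec.ByteString.Char8` is an acceptable stand-in for the `endOfLine` parser mentioned above, and that a `Record` is a `Vector` of strict `ByteString`s. The names `comma`, `recordLine`, and `firstTwo` are hypothetical.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Control.Applicative ((<$>), (<*), (<*>))
import Data.Attoparsec.ByteString (Parser, parseOnly)
import Data.Attoparsec.ByteString.Char8 (endOfLine)
import qualified Data.ByteString as B
import Data.Csv.Parser (record)
import qualified Data.Vector as V
import Data.Word (Word8)

-- | ASCII code of the comma, used as the field delimiter.
comma :: Word8
comma = 44

-- | A record followed by its line separator (assumption: attoparsec's
-- endOfLine, which accepts "\n" and "\r\n").
recordLine :: Parser [B.ByteString]
recordLine = V.toList <$> record comma <* endOfLine

-- | Hypothetical helper: parse two records, where only the first one is
-- required to end with a line separator.
firstTwo :: B.ByteString -> Either String ([B.ByteString], [B.ByteString])
firstTwo = parseOnly ((,) <$> recordLine <*> (V.toList <$> record comma))

main :: IO ()
main = print (firstTwo "a,b\nc,d")
```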

name :: Word8 -> Parser Name

Parse a header name. Header names have the same format as regular fields.

field :: Word8 -> Parser Field

Parse a field. The field may be in either the escaped or non-escaped format. The return value is unescaped.
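The following sketch illustrates the unescaping behaviour on a single field. It assumes that `Field` is a strict `ByteString` and uses attoparsec's `parseOnly`, which does not require the whole input to be consumed. The helper `firstField` is hypothetical.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Attoparsec.ByteString (parseOnly)
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as B8
import Data.Csv.Parser (field)

-- | Hypothetical helper: parse the first comma-delimited field of the input.
-- 44 is the ASCII code of the comma.
firstField :: B.ByteString -> Either String B.ByteString
firstField = parseOnly (field 44)

main :: IO ()
main = do
    -- Escaped field: surrounding quotes are dropped and "" becomes ".
    either putStrLn B8.putStrLn (firstField "\"say \"\"hi\"\", ok\"")
    -- Non-escaped field: parsing stops at the delimiter.
    either putStrLn B8.putStrLn (firstField "plain,rest")
```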