Safe Haskell | None |
---|---|
Language | Haskell2010 |
A CSV parser. The parser defined here is RFC 4180 compliant, with the following extensions:
- Empty lines are ignored.
- Non-escaped fields may contain any characters except double-quotes, commas, carriage returns, and newlines.
- Escaped fields may contain any characters (but double-quotes need to be escaped).
The functions in this module can be used to implement e.g. a resumable parser that is fed input incrementally.
- byteStringChar8 :: Siphon ByteString
- encodeRow :: Vector (Escaped ByteString) -> ByteString
- escape :: ByteString -> Escaped ByteString
- escapeAlways :: ByteString -> Escaped ByteString
- sepByDelim1' :: Parser a -> Word8 -> Parser [a]
- sepByEndOfLine1' :: Parser a -> Parser [a]
- row :: Word8 -> Parser (Vector ByteString)
- rowNoNewline :: Word8 -> Parser (Vector ByteString)
- removeBlankLines :: [Vector ByteString] -> [Vector ByteString]
- field :: Word8 -> Parser ByteString
- escapedField :: Parser ByteString
- unescapedField :: Word8 -> Parser ByteString
- dquote :: Parser Char
- unescape :: Parser ByteString
- (<$!>) :: Monad m => (a -> b) -> m a -> m b
- blankLine :: Vector ByteString -> Bool
- liftM2' :: Monad m => (a -> b -> c) -> m a -> m b -> m c
- endOfLine :: Parser ()
- doubleQuote :: Word8
- newline :: Word8
- cr :: Word8
- comma :: Word8
Documentation
encodeRow :: Vector (Escaped ByteString) -> ByteString
escape :: ByteString -> Escaped ByteString
escapeAlways :: ByteString -> Escaped ByteString
This implementation is definitely suboptimal. A better option (which would waste a little space but would be much faster) would be to build the new bytestring by writing to a buffer directly.
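The note above is about how the quoted output is built, not about the quoting rule itself. As a reference point, here is a minimal sketch of RFC 4180 style quoting over plain ByteString values. The names `needsQuoting`, `quoteAlways` and `quoteIfNeeded` are hypothetical, the `Escaped` wrapper is left out, and the assumption that `escape` quotes only when necessary is not stated in this documentation.

```haskell
import qualified Data.ByteString.Char8 as B

-- Hypothetical helper: does the field contain a character that forces quoting?
needsQuoting :: B.ByteString -> Bool
needsQuoting = B.any (`elem` ",\"\r\n")

-- Always wrap the field in double quotes, doubling any embedded double quote.
-- This is the simple (not the fast) way: it builds many small chunks.
quoteAlways :: B.ByteString -> B.ByteString
quoteAlways s = B.concat [B.singleton '"', B.concatMap dbl s, B.singleton '"']
  where
    dbl '"' = B.pack "\"\""
    dbl c   = B.singleton c

-- Assumed behaviour of a non-"always" variant: quote only when required.
quoteIfNeeded :: B.ByteString -> B.ByteString
quoteIfNeeded s
  | needsQuoting s = quoteAlways s
  | otherwise      = s
```

For instance, `quoteIfNeeded (B.pack "a,b")` yields the quoted field, while `quoteIfNeeded (B.pack "abc")` returns its input unchanged.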
sepByDelim1' :: Parser a -> Word8 -> Parser [a]
Specialized version of sepBy1' which is faster due to not accepting an arbitrary separator.
sepByEndOfLine1' :: Parser a -> Parser [a]
Specialized version of sepBy1' which is faster due to not accepting an arbitrary separator.
row :: Word8 -> Parser (Vector ByteString)
The Word8 argument is the field delimiter.
Parse a record, not including the terminating line separator. The terminating line separator is not included because the last record in a CSV file is allowed to omit it. You most likely want to use the endOfLine parser in combination with this parser.
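To illustrate why the record parser leaves the line separator to the caller, here is a small sketch built on `ReadP` from base rather than on this module's `Parser` type. It handles unquoted fields only, and `simpleField`, `recordP`, `lineEnd`, `recordsP` and `parseRecords` are hypothetical names, not part of this module.

```haskell
import Text.ParserCombinators.ReadP

-- Unquoted field: runs until a delimiter, double quote, CR or LF.
simpleField :: Char -> ReadP String
simpleField delim = munch (`notElem` [delim, '"', '\r', '\n'])

-- One record: fields separated by the delimiter, no line separator consumed.
recordP :: Char -> ReadP [String]
recordP delim = do
  f  <- simpleField delim
  fs <- (char delim *> recordP delim) <++ pure []
  pure (f : fs)

-- Line separator: try "\r\n" first so a lone '\r' does not shadow it.
lineEnd :: ReadP ()
lineEnd = (() <$ string "\r\n") <++ (() <$ char '\n') <++ (() <$ char '\r')

-- Records separated by line ends; the last record may omit its separator.
recordsP :: Char -> ReadP [[String]]
recordsP delim = do
  r    <- recordP delim
  more <- (lineEnd *> ((eof *> pure []) <++ recordsP delim)) <++ (eof *> pure [])
  pure (r : more)

parseRecords :: Char -> String -> Maybe [[String]]
parseRecords delim s = case readP_to_S (recordsP delim) s of
  [(rs, "")] -> Just rs
  _          -> Nothing
```

Because `recordP` stops before the separator, the same parser serves both a record followed by a line end and a final record that ends at end of input.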
rowNoNewline :: Word8 -> Parser (Vector ByteString)
The Word8 argument is the field delimiter.
removeBlankLines :: [Vector ByteString] -> [Vector ByteString]
field :: Word8 -> Parser ByteString
Parse a field. The field may be in either the escaped or non-escaped format. The return value is unescaped.
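The two field formats can be sketched as follows. This is an illustration using `ReadP` from base on `String` input, not this module's implementation; `escapedFieldP`, `unescapedFieldP`, `fieldP` and `runField` are hypothetical names.

```haskell
import Text.ParserCombinators.ReadP

-- Escaped (quoted) field: the content between double quotes, where a pair
-- of double quotes stands for one literal double quote.
escapedFieldP :: ReadP String
escapedFieldP = char '"' *> go
  where
    go = do
      chunk <- munch (/= '"')
      _     <- char '"'
      -- A second quote right away means an escaped quote; otherwise the
      -- quote just consumed closed the field.
      rest  <- (char '"' *> fmap ('"' :) go) <++ pure ""
      pure (chunk ++ rest)

-- Non-escaped field: anything except the delimiter, double quote, CR, LF.
unescapedFieldP :: Char -> ReadP String
unescapedFieldP delim = munch (`notElem` [delim, '"', '\r', '\n'])

-- Either format; the result is already unescaped.
fieldP :: Char -> ReadP String
fieldP delim = escapedFieldP <++ unescapedFieldP delim

-- Run the field parser on a whole input string.
runField :: Char -> String -> Maybe String
runField delim s = case readP_to_S (fieldP delim <* eof) s of
  [(x, "")] -> Just x
  _         -> Nothing
```

In this sketch a quoted field may contain the delimiter, so `runField ',' "\"b,c\""` succeeds while `runField ',' "b,c"` does not, since the unquoted input holds two fields.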
unescape :: Parser ByteString
This could be improved. We could avoid the builder and just write to a buffer directly.
blankLine :: Vector ByteString -> Bool
Is this an empty record (i.e. a blank line)?
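A minimal sketch of the blank-line filter, using lists of `String` in place of `Vector ByteString` to stay within base. It assumes that a blank line parses to a record holding exactly one empty field, which this documentation does not spell out; `isBlank` and `dropBlank` are hypothetical names.

```haskell
-- Assumption: a blank line shows up as a record with a single empty field.
isBlank :: [String] -> Bool
isBlank fields = fields == [""]

-- Keep every record that is not a blank line, preserving order.
dropBlank :: [[String]] -> [[String]]
dropBlank = filter (not . isBlank)
```

Under that assumption, a record such as `["", ""]` (a line containing only a delimiter) is not treated as blank.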
liftM2' :: Monad m => (a -> b -> c) -> m a -> m b -> m c
A version of liftM2 that is strict in the result of its first action.
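One way to obtain this strictness is to force the first result before running the second action. The sketch below is an assumption about how such a function could be written, under the hypothetical name `liftM2Strict`; it is not taken from this module's source.

```haskell
-- Like liftM2, but the result of the first action is forced to weak head
-- normal form before the second action runs.
liftM2Strict :: Monad m => (a -> b -> c) -> m a -> m b -> m c
liftM2Strict f ma mb = do
  x <- ma
  y <- x `seq` mb   -- force x here, then run the second action
  return (f x y)
```

In a parser that accumulates results in a loop, forcing each intermediate value this way avoids building a chain of unevaluated thunks.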
endOfLine :: Parser ()
Match either a single newline character '\n', or a carriage return followed by a newline character "\r\n", or a single carriage return '\r'.
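The three accepted line endings can be sketched with `ReadP` from base as shown below. `lineEnd` and `afterLineEnd` are hypothetical names, and this is an illustration of the matching rule rather than this module's implementation.

```haskell
import Text.ParserCombinators.ReadP

-- Accept "\r\n", '\n' or '\r'. The two-character form is tried first so
-- that "\r\n" is consumed as one line ending rather than as a lone '\r'.
lineEnd :: ReadP ()
lineEnd = (() <$ string "\r\n") <++ (() <$ char '\n') <++ (() <$ char '\r')

-- Return the input left over after one line ending, if one was matched.
afterLineEnd :: String -> Maybe String
afterLineEnd s = case readP_to_S lineEnd s of
  [((), rest)] -> Just rest
  _            -> Nothing
```

The left-biased choice `<++` commits to the first alternative that succeeds, which is what makes the ordering of the alternatives matter here.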