repa-convert-4.2.0.1: Packing and unpacking flat tables.

Safe HaskellNone
LanguageHaskell98

Data.Repa.Convert

Contents

Description

Convert tuples of Haskell values to and from ASCII or packed binary representations.

This package is intended for cheap and cheerful serialisation and deserialisation of flat tables, where each row has a fixed format. If you have a table consisting of a couple hundred megs of Pipe-Separated-Variables issued by some filthy enterprise system, then this package is for you.

If you want to parse context-free, or context-sensitive languages then try the parsec or attoparsec packages. If you have binary data that does not have a fixed format then try the binary or cereal packages.

For testing purposes, use packToString which takes a format, a record, and returns a list of bytes.

> import Data.Repa.Convert

> let format   = Sep '|' (VarAsc :*: IntAsc :*: DoubleAsc :*: ())
> let Just str = packToString format ("foo" :*: 66 :*: 93.42 :*: ())
> str
"foo|66|93.42"

We can then unpack the raw bytes back to Haskell values with unpackFromString.

> unpackFromString format str 
Just ("foo" :*: (66 :*: (93.42 :*: ())))

In production code use unsafeRunPacker and unsafeRunUnpacker to work directly with a buffer in foreign memory.

  • NOTE that in the current version the separating character is un-escapable.
  • The above means that the format (Sep ',') does NOT parse a CSV file according to the CSV specification: http://tools.ietf.org/html/rfc4180.

Synopsis

Documentation

The Formats module contains the pre-defined data formats.

Data formats

class Format f where Source

Relates a storage format to the Haskell type of the value that is stored in that format.

Associated Types

type Value f Source

Get the type of a value with this format.

Methods

fieldCount :: f -> Int Source

Yield the number of separate fields in this format.

minSize :: f -> Int Source

Yield the minumum number of bytes that a value of this format will take up.

Packing a value into this format is guaranteed to use at least this many bytes. This is exact for fixed-size formats.

fixedSize :: f -> Maybe Int Source

For fixed size formats, yield their size (length) in bytes.

Yields Nothing if this is not a fixed size format.

packedSize :: f -> Value f -> Maybe Int Source

Yield the size of a value in the given format.

Yields Nothing when a collection of values is to be packed into a fixed length format, but the size of the collection does not match the format.

If fixedSize returns a size then packedSize returns the same size.

Type constraints

forFormat :: format -> Value format -> Value format Source

Constrain the type of a value to match the given format.

The value itself is not used.

listFormat :: format -> [Value format] -> [Value format] Source

Constrain the type of some values to match the given format.

The value itself is not used.

Packing data

class Format format => Packable format where Source

Class of storage formats that can have values packed and unpacked from foreign bufferes.

The methods are written using continuations to make it easier for GHC to optimise its core code when packing/unpacking many fields.

Methods

pack Source

Arguments

:: format

Storage format.

-> Value format

Value to pack.

-> Packer

Packer that can write the value.

Pack a value into a buffer using the given format.

unpack Source

Arguments

:: format

Storage format.

-> Unpacker (Value format)

Unpacker for that format.

Unpack a value from a buffer using the given format.

Packer monoid

data Packer Source

Packer wraps a function that can write to a buffer.

Constructors

Packer 

Fields

fromPacker :: Ptr Word8 -> (Ptr Word8 -> IO (Maybe (Ptr Word8))) -> IO (Maybe (Ptr Word8))

Takes start of buffer, packs data into it, and calls the continuation with a pointer to the byte just after the last one that was written.

Instances

unsafeRunPacker Source

Arguments

:: Packer

Packer to run.

-> Ptr Word8

Start of buffer.

-> IO (Maybe (Ptr Word8))

Pointer to the byte after the last one written.

Pack data into the given buffer.

PRECONDITION: The buffer needs to be big enough to hold the packed data, otherwise you'll corrupt the heap (bad). Use packedSize to work out how big it needs to be.

Unpacker monad

data Unpacker a Source

Constructors

Unpacker 

Fields

fromUnpacker :: forall b. Ptr Word8 -> Ptr Word8 -> (Word8 -> Bool) -> IO b -> (Ptr Word8 -> a -> IO b) -> IO b

Takes pointers to the first byte in the buffer, the first byte after the buffer, and a special field terminating character. The field terminating character is used by variable length encodings where the length of the encoded data cannot be determined from the encoding itself.

If a value can be successfully unpacked from the buffer then it is passed to the continuation, along with a pointer to the byte after the last one that was read. If not, then the fail action is invoked.

unsafeRunUnpacker Source

Arguments

:: Unpacker a

Unpacker to run.

-> Ptr Word8

Source buffer.

-> Int

Length of source buffer.

-> (Word8 -> Bool)

Detect a field terminator.

-> IO (Maybe (a, Ptr Word8))

Unpacked result, and pointer to the byte after the last one read.

Unpack data from the given buffer.

PRECONDITION: The buffer must be at least the minimum size of the format (minSize). This allows us to avoid repeatedly checking for buffer overrun when unpacking fixed size format. If the buffer is not long enough then you'll get an indeterminate result (bad).

Interfaces

Default Ascii

packToAscii :: (FormatAscii a, Value (FormatAscii' a) ~ a, Packable (FormatAscii' a)) => a -> Maybe String Source

Pack a value to a list of Word8 using the default Ascii format.

List Interface

packToList8 :: Packable format => format -> Value format -> Maybe [Word8] Source

Pack a value to a list of Word8.

unpackFromList8 :: Packable format => format -> [Word8] -> Maybe (Value format) Source

Unpack a value from a list of Word8.

String Interface

packToString :: Packable format => format -> Value format -> Maybe String Source

Pack a value to a String.

unpackFromString :: Packable format => format -> String -> Maybe (Value format) Source

Unpack a value from a String.