flat-0.4.4: Principled and efficient bit-oriented binary serialization.

Safe Haskell: Safe
Language: Haskell2010

Flat.Tutorial

    Documentation

    To (de)serialise a data type, make it an instance of the Flat class.

    Generics-based support is available to derive a correct instance automatically.

    Let’s see some code.

    We need a couple of extensions:

    >>> :set -XDeriveGeneric -XDeriveAnyClass
    

    The top-level Flat module:

    >>> import Flat
    

    And, just for fun, a couple of functions to display an encoded value as a sequence of bits:

    >>> import Flat.Instances.Test (flatBits,allBits)
    

    Define a few custom data types, deriving Generic and Flat:

    >>> data Result = Bad | Good deriving (Show,Generic,Flat)
    
    >>> data Direction = North | South | Center | East | West deriving (Show,Generic,Flat)
    
    >>> data List a = Nil | Cons a (List a) deriving (Show,Generic,Flat)
    

    Now we can encode a List of Directions using flat:

    >>> flat $ Cons North (Cons South Nil)
    "\149"
    

    The result is a strict ByteString.

    And decode it back using unflat:

    >>> unflat . flat $ Cons North (Cons South Nil) :: Decoded (List Direction)
    Right (Cons North (Cons South Nil))
    

    The result is a Decoded value: Either a DecodeException or the actual value.
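
    As a minimal sketch of handling both outcomes, the program below pattern-matches on the Decoded result. It relies only on flat, unflat and Decoded from the Flat module; the helper name describe and its messages are illustrative assumptions, not part of the library.

    ```haskell
    {-# LANGUAGE DeriveAnyClass #-}
    {-# LANGUAGE DeriveGeneric  #-}

    import Flat
    import GHC.Generics (Generic)

    data Direction = North | South | Center | East | West
      deriving (Show, Eq, Generic, Flat)

    -- 'describe' is a hypothetical helper, not part of flat.
    -- Left carries the DecodeException, Right the decoded value.
    describe :: Decoded Direction -> String
    describe (Left err) = "could not decode: " ++ show err
    describe (Right d)  = "decoded " ++ show d

    main :: IO ()
    main = putStrLn (describe (unflat (flat West)))
    ```

    Matching on Left and Right keeps decoding failures explicit, which matters when the bytes come from an untrusted or possibly truncated source.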

    Optimal Bit-Encoding

    A peculiarity of Flat is that it uses an optimal bit-encoding rather than the usual byte-oriented one.

    One bit is all we need for a Result or for an empty List value:

    >>> flatBits Good
    "1"
    
    >>> flatBits (Nil::List Direction)
    "0"
    

    Two or three bits suffice for a Direction value:

    >>> flatBits South
    "01"
    
    >>> flatBits West
    "111"
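
    The bit strings above are consistent with a simple model in which the constructor list is split in two halves repeatedly, with 0 selecting the left half and 1 the right half. The sketch below implements that model using only base; the function codes is hypothetical, and the model is an assumption that reproduces the outputs shown here, not a description of the library's internals.

    ```haskell
    -- Hypothetical model, not part of flat: give each constructor a bit string
    -- by repeatedly splitting the constructor list in two halves
    -- ('0' selects the left half, '1' the right half).
    codes :: [a] -> [(a, String)]
    codes []  = []
    codes [x] = [(x, "")]          -- a single alternative needs no bits
    codes xs  = prefix '0' left ++ prefix '1' right
      where
        (left, right) = splitAt (length xs `div` 2) xs
        prefix b part = [(x, b : c) | (x, c) <- codes part]

    main :: IO ()
    main = mapM_ print (codes ["North", "South", "Center", "East", "West"])
    ```

    With five constructors the first split is two against three, which is why North, South and Center get two bits while East and West get three.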
    

    For the serialisation to work with byte-oriented devices or storage, we need to add some padding.

    To do so, rather than encoding a plain value, flat encodes a PostAligned value, that is, a value followed by a Filler that extends to the next byte boundary.

    In practice, the padding is a possibly empty sequence of 0s followed by a single 1.

    For example, this list encodes as 7 bits:

    >>> flatBits $ Cons North (Cons South Nil)
    "1001010"
    

    And, with the added padding of a final "1", will snugly fit in a single byte:

    >>> allBits $ Cons North (Cons South Nil)
    "10010101"
    

    You don't need to worry about these details, however: byte-padding is added automatically by flat and removed by unflat.
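
    To make the padding rule concrete, here is a small model of it over bit strings, using only base. The names filler and postAligned are hypothetical and merely mirror the description above (zero or more 0s, then a 1, ending on a byte boundary); the library itself works on real bits, not on Strings.

    ```haskell
    -- Hypothetical model of the filler, not part of flat:
    -- zero or more '0's followed by a single '1', ending on a byte boundary.
    filler :: Int -> String
    filler usedBits = replicate zeros '0' ++ "1"
      where
        zeros = 7 - (usedBits `mod` 8)

    -- Append the filler to an already encoded bit string.
    postAligned :: String -> String
    postAligned bits = bits ++ filler (length bits)

    main :: IO ()
    main = putStrLn (postAligned "1001010")
    ```

    Because the closing 1 is always present, a value that already ends on a byte boundary gets a whole extra byte of padding under this model (seven 0s and a 1).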

    Pre-defined Instances

    Flat instances are already defined for relevant types of some common packages: array, base, bytestring, containers, dlist, mono-traversable, text, unordered-containers, vector.

    They are automatically imported by the Flat module.

    For example:

    >>> flatBits $ Just True
    "11"
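
    As a sketch of using the pre-defined instances without declaring any type of your own, the program below round-trips a few base values through flat and unflat. The helper roundTrips is hypothetical; it assumes only that Flat instances exist for Maybe, Bool, Char, Int, lists and pairs.

    ```haskell
    import Flat

    -- 'roundTrips' is a hypothetical helper, not part of flat:
    -- encode a value, decode it again, and compare with the original.
    roundTrips :: (Flat a, Eq a) => a -> Bool
    roundTrips x = case unflat (flat x) of
      Right y -> y == x
      Left _  -> False

    main :: IO ()
    main = do
      print (roundTrips (Just True))
      print (roundTrips [True, False, True])
      print (roundTrips (Just 'x', [1, 2, 3 :: Int]))
    ```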
    

    Compatibility

    GHC

    GHCJS

    • ghcjs-8.4.0.1.

    NOTE: Some tests are not run for ghcjs as they are related to unsupported features such as UTF16 encoding of Text and short ByteString.

    For details of which tests are skipped, search test/Spec.hs for ghcjs_HOST_OS.

    NOTE: Some older versions of ghcjs and versions of flat prior to 0.33 encoded Double values incorrectly when not aligned with a byte boundary.

    ETA

    It builds (with etlas 1.5.0.0 and eta-0.8.6b2) and passes the doctest-static test, but it won't complete the main spec test, probably because of a recursive iteration issue; see https://github.com/typelead/eta/issues/901.

    Support for eta is not currently being actively maintained.

    Known Bugs and Infelicities

    Longish compilation times

    flat relies more than other serialisation libraries on extensive inlining for its good performance. This unfortunately leads to longer compilation times.

    If you have many data types, or very large ones, this might become an issue.

    A couple of good practices that will eliminate or mitigate this problem are:

    • During development, turn optimisations off (stack --fast or -O0 in the cabal file).
    • Keep your serialisation code in a separate module (or modules).
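
    The second practice could look like the sketch below: the serialisable types and their derived instances live in one module that the rest of the application imports. The module name MyApp.Wire, the Event type and the two wrapper functions are hypothetical and serve only to illustrate the layout.

    ```haskell
    {-# LANGUAGE DeriveAnyClass #-}
    {-# LANGUAGE DeriveGeneric  #-}

    -- Hypothetical module name and types, shown only to illustrate the layout.
    module MyApp.Wire
      ( Event (..)
      , encodeEvent
      , decodeEvent
      ) where

    import Data.ByteString (ByteString)
    import Flat
    import GHC.Generics (Generic)

    data Event = Started | Stopped | Moved Int Int
      deriving (Show, Eq, Generic, Flat)

    -- Monomorphic wrappers, so callers use plain functions on Event.
    encodeEvent :: Event -> ByteString
    encodeEvent = flat

    decodeEvent :: ByteString -> Decoded Event
    decodeEvent = unflat
    ```

    With this layout, editing unrelated application code should not force the module containing the derived instances to be recompiled.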

    Data types with more than 512 constructors are currently unsupported

    This limit could easily be extended; shout if you need it.

    Other

    Full list of open issues.

    Acknowledgements

    flat reuses ideas and adapts code from various packages, mainly store, binary-bits and binary, and includes contributions from Justus Sagemüller.