winery: Sustainable serialisation library

[ bsd3, codec, data, library, parsing, program, serialization ] [ Propose Tags ]

Please see the README on Github at https://github.com/fumieval/winery#readme


[Skip to Readme]

Downloads

Maintainer's Corner

Package maintainers

For package maintainers and hackage trustees

Candidates

Versions [RSS] 0, 0.1, 0.1.1, 0.1.2, 0.2, 0.2.1, 0.3, 0.3.1, 1, 1.0.1, 1.1, 1.1.1, 1.1.2, 1.1.3, 1.2, 1.3, 1.3.1, 1.3.2, 1.4
Change log ChangeLog.md
Dependencies base (>=4.7 && <5), bytestring, containers, cpu, hashable, megaparsec, mtl, prettyprinter, prettyprinter-ansi-terminal, scientific, text, transformers, unordered-containers, vector, winery [details]
License BSD-3-Clause
Copyright Copyright (c) 2017 Fumiaki Kinoshita
Author Fumiaki Kinoshita
Maintainer fumiexcel@gmail.com
Category Data
Home page https://github.com/fumieval/winery#readme
Bug tracker https://github.com/fumieval/winery/issues
Source repo head: git clone https://github.com/fumieval/winery
Uploaded by FumiakiKinoshita at 2018-07-18T02:22:58Z
Distributions
Reverse Dependencies 3 direct, 0 indirect [details]
Executables winery
Downloads 8247 total (66 in the last 30 days)
Rating (no votes yet) [estimated by Bayesian average]
Your Rating
  • λ
  • λ
  • λ
Status Docs available [build log]
Last success reported on 2018-07-18 [all 1 reports]

Readme for winery-0.1.2

[back to package description]

winery

winery is a serialisation library for Haskell. It tries to achieve two goals: compact representation and perpetual inspectability.

The standard binary library has no way to inspect the serialised value without the original instance.

There's serialise, which is an alternative library based on CBOR. Every value has to be accompanied with tags, so it tends to be redundant for arrays of small values. Encoding records with field names is also redudant.

Interface

The interface is simple; serialise encodes a value with its schema, and deserialise decodes a ByteString using the schema in it.

class Serialise a

serialise :: Serialise a => a -> B.ByteString
deserialise :: Serialise a => B.ByteString -> Either String a

It's also possible to serialise schemata and data separately.

-- Note that 'Schema' is an instance of 'Serialise'
schema :: Serialise a => proxy a -> Schema
serialiseOnly :: Serialise a => a -> B.ByteString

getDecoder gives you a deserialiser.

getDecoder :: Serialise a => Schema -> Either StrategyError (ByteString -> a)

For user-defined datatypes, you can either define instances

instance Serialise Foo where
  schemaVia = gschemaViaRecord
  toEncoding = gtoEncodingRecord
  deserialiser = gdeserialiserRecord Nothing

for single-constructor records, or just

instance Serialise Foo

for any ADT. The former explicitly describes field names in the schema, and the latter does constructor names.

The schema

The definition of Schema is as follows:

data Schema = SSchema !Word8
  | SUnit
  | SBool
  | SChar
  | SWord8
  | SWord16
  | SWord32
  | SWord64
  | SInt8
  | SInt16
  | SInt32
  | SInt64
  | SInteger
  | SFloat
  | SDouble
  | SBytes
  | SText
  | SList !Schema
  | SArray !(VarInt Int) !Schema -- fixed size
  | SProduct [Schema]
  | SProductFixed [(VarInt Int, Schema)] -- fixed size
  | SRecord [(T.Text, Schema)]
  | SVariant [(T.Text, [Schema])]
  | SFix Schema -- ^ binds a fixpoint
  | SSelf !Word8 -- ^ @SSelf n@ refers to the n-th innermost fixpoint
  deriving (Show, Read, Eq, Generic)

The Serialise instance is derived by generics.

There are some special schemata:

  • SSchema n is a schema of schema. The winery library stores the concrete schema of Schema for each version, so it can deserialise data even if the schema changes.
  • SFix binds a fixpoint.
  • SSelf n refers to the n-th innermost fixpoint bound by SFix. This allows it to provide schemata for inductive datatypes.

Backward compatibility

If having default values for missing fields is sufficient, you can pass a default value to gdeserialiserRecord:

  deserialiser = gdeserialiserRecord $ Just $ Foo "" 42 0

You can also build a custom deserialiser using the Alternative instance and combinators such as extractField, extractConstructor, etc.

Pretty-printing

Term can be deserialised from any winery data. It can be pretty-printed using the Pretty instance:

{ bar: "hello"
, baz: 3.141592653589793
, foo: Just 42
}

You can use the winery command-line tool to inspect values.

$ winery '.[:10] | .first_name .last_name' benchmarks/data.winery
Shane Plett
Mata Snead
Levon Sammes
Irina Gourlay
Brooks Titlow
Antons Culleton
Regine Emerton
Starlin Laying
Orv Kempshall
Elizabeth Joseff
Cathee Eberz

Benchmark

data TestRec = TestRec
  { id_ :: !Int
  , first_name :: !Text
  , last_name :: !Text
  , email :: !Text
  , gender :: !Gender
  , num :: !Int
  , latitude :: !Double
  , longitude :: !Double
  } deriving (Show, Generic)

(De)serialisation of the datatype above using generic instances:

serialise/winery                         mean 658.6 μs  ( +- 45.04 μs  )
serialise/binary                         mean 1.056 ms  ( +- 58.95 μs  )
serialise/serialise                      mean 258.8 μs  ( +- 5.654 μs  )
deserialise/winery                       mean 706.4 μs  ( +- 52.41 μs  )
deserialise/binary                       mean 1.393 ms  ( +- 56.71 μs  )
deserialise/serialise                    mean 765.8 μs  ( +- 30.26 μs  )

Not bad, considering that binary and serialise don't encode field names.