avro: Avro serialization support for Haskell

[ bsd3, data, library ] [ Propose Tags ]

Avro serialization and deserialization support for Haskell


[Skip to Readme]

Flags

Manual Flags

NameDescriptionDefault
dev

Use development GHC flags

Disabled
Automatic Flags
NameDescriptionDefault
templatehaskell

Build Avro.Deriving, which uses Template Haskell.

Enabled

Use -f <flag> to enable a flag, or -f -<flag> to disable that flag. More info

Downloads

Note: This package has metadata revisions in the cabal description newer than included in the tarball. To unpack the package including the revisions, use 'cabal get'.

Maintainer's Corner

Package maintainers

For package maintainers and hackage trustees

Candidates

Versions [RSS] 0.1.0.0, 0.1.0.1, 0.2.0.0, 0.2.1.0, 0.2.1.1, 0.3.0.0, 0.3.0.1, 0.3.0.2, 0.3.0.3, 0.3.0.4, 0.3.0.5, 0.3.1.0, 0.3.1.1, 0.3.1.2, 0.3.2.0, 0.3.3.0, 0.3.3.1, 0.3.4.0, 0.3.4.1, 0.3.4.2, 0.3.4.3, 0.3.5.0, 0.3.5.1, 0.3.5.2, 0.3.6.0, 0.3.6.1, 0.4.0.0, 0.4.1.0, 0.4.1.1, 0.4.1.2, 0.4.2.0, 0.4.3.0, 0.4.4.0, 0.4.4.1, 0.4.4.2, 0.4.4.3, 0.4.4.4, 0.4.5.0, 0.4.5.1, 0.4.5.2, 0.4.5.3, 0.4.5.4, 0.4.6.0, 0.4.7.0, 0.5.0.0, 0.5.1.0, 0.5.2.0, 0.5.2.1, 0.6.0.0, 0.6.0.1, 0.6.0.2, 0.6.1.0, 0.6.1.1, 0.6.1.2 (info)
Change log ChangeLog.md
Dependencies aeson, array, base (>=4.8 && <5.0), base16-bytestring, bifunctors, binary, bytestring, containers, data-binary-ieee754, deepseq, fail, hashable, mtl, scientific, semigroups, tagged, template-haskell (>=2.4), text (<1.2.4), tf-random, unordered-containers, vector, zlib [details]
License BSD-3-Clause
Author Thomas M. DuBuisson
Maintainer Alexey Raga <alexey.raga@gmail.com>
Revised Revision 1 made by haskellworks at 2019-08-12T01:23:50Z
Category Data
Home page https://github.com/haskell-works/avro#readme
Bug tracker https://github.com/haskell-works/avro/issues
Source repo head: git clone https://github.com/haskell-works/avro
Uploaded by haskellworks at 2019-08-01T11:23:43Z
Distributions LTSHaskell:0.6.1.2
Reverse Dependencies 11 direct, 1 indirect [details]
Downloads 29076 total (81 in the last 30 days)
Rating 2.25 (votes: 2) [estimated by Bayesian average]
Your Rating
  • λ
  • λ
  • λ
Status Docs available [build log]
Last success reported on 2019-08-01 [all 1 reports]

Readme for avro-0.4.5.1

[back to package description]

Native Haskell implementation of Avro

This is a Haskell Avro library useful for decoding and encoding Avro data structures. Avro can be thought of as a serialization format and RPC specification which induces three separable tasks:

  • Serialization/Deserialization - This library has been used "in anger" for:
    • Deserialization of avro container files
    • Serialization/deserialization Avro messages to/from Kafka topics
  • RPC - There is currently no support for Avro RPC in this library.

This library also provides functionality for automatically generating Avro-related data types and instances from Avro schemas (using TemplateHaskell).

Quickstart

This library provides the following conversions between Haskell types and Avro types:

Haskell type Avro type
() "null"
Bool "boolean"
Int, Int64 "long"
Int32 "int"
Double "double"
Text "string"
ByteString "bytes"
Maybe a ["null", "a"]
Either a b ["a", "b"]
Map Text a {"type": "map", "value": "a"}
Map String a {"type": "map", "value": "a"}
HashMap Text a {"type": "map", "value": "a"}
HashMap String a {"type": "map", "value": "a"}
[a] {"type": "array", "value": "a"}

User defined data types should provide HasAvroSchema/ToAvro/FromAvro instances to be encoded/decoded to/from Avro.

Defining types and HasAvroSchema / FromAvro / ToAvro manually

Typically these imports are useful:

import           Data.Avro
import           Data.Avro.Schema as S
import qualified Data.Avro.Types  as AT

Assuming there is a data type to be encoded/decoded from/to Avro:

data Gender = Male | Female deriving (Eq, Ord, Show, Enum)
data Person = Person
     { fullName :: Text
     , age      :: Int32
     , gender   :: Gender
     , ssn      :: Maybe Text
     } deriving (Show, Eq)

Avro schema for this type can be defined as:

genderSchema :: Schema
genderSchema = mkEnum "Gender" [] Nothing Nothing ["Male", "Female"]

personSchema :: Schema
personSchema =
  Record "Person" Nothing [] Nothing Nothing
    [ fld "name"   String       Nothing
    , fld "age"    Int          Nothing
    , fld "gender" genderSchema Nothing
    , fld "ssn" (mkUnion $ Null :| [String]) Nothing
    ]
    where
     fld nm ty def = Field nm [] Nothing Nothing ty def

instance HasAvroSchema Person where
  schema = pure personSchema

ToAvro instance for Person can be defined as:

instance ToAvro Person where
  schema = pure personSchema
  toAvro p = record personSchema
             [ "name"   .= fullName p
             , "age"    .= age p
             , "gender" .= gender p
             , "ssn"    .= ssn p
             ]

FromAvro instance for Person can be defined as:

instance FromAvro Person where
  fromAvro (AT.Record _ r) =
    Person <$> r .: "name"
           <*> r .: "age"
           <*> r .: "gender"
           <*> r .: "ssn"
  fromAvro r = badValue r "Person"

Defining types and HasAvroSchema / FromAvro / ToAvro "automatically"

This library provides functionality to derive Haskell data types and HasAvroSchema/FromAvro/ToAvro instances "automatically" from already existing Avro schemas (using TemplateHaskell).

Examples

deriveAvro will derive data types, FromAvro and ToAvro instances from a provided Avro schema file:

{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE DeriveGeneric   #-}
import Data.Avro.Deriving

deriveAvro "schemas/contract.avsc"

Similarly, deriveFromAvro can be used to only derive data types and FromAvro, but not ToAvro instances.

If you prefer defining Avro schema in Haskell and not in avsc, then deriveAvro' can be used instead of deriveAvro.

Conventions

When Haskell data types are generated, these conventions are followed:

  • Type and field names are "sanitized": all the charachers except [a-z,A-Z,',_] are removed from names
  • Field names are prefixed with the name of the record they are declared in.

For example, if Avro schema defines Person record as:

{ "type": "record",
  "name": "Person",
  "fields": [
    { "name": "name", "type": "string"}
  ]
}

then generated Haskell type will look like:

data Person = Person
     { personName :: Text
     } deriving (Show, Eq)

Limitations

Two-parts unions like ["null", "MyType"] or ["MyType", "YourType"] are supported (as Haskell's Maybe MyType and Either MyType YourType), but multi-parts unions are currently not supported. It is not due to any fundamental problems but because it has not been done yet. PRs are welcomed! :)

TODO

Please see the TODO