Safe Haskell | None |
---|---|
Language | Haskell2010 |
Avro Schema
s, represented here as values of type Schema
,
describe the serialization and de-serialization of values.
In Avro schemas are compose-able such that encoding data under a schema and
decoding with a variant, such as newer or older version of the original
schema, can be accomplished by using the Deconflict
module.
Synopsis
- data Schema where
- Null
- Boolean
- Int { }
- Long { }
- Float
- Double
- Bytes { }
- String { }
- Array { }
- Map { }
- NamedType TypeName
- Record { }
- Enum { }
- Union { }
- Fixed {
- name :: TypeName
- aliases :: [TypeName]
- size :: Int
- logicalTypeF :: Maybe LogicalTypeFixed
- pattern Int' :: Schema
- pattern Long' :: Schema
- pattern Bytes' :: Schema
- pattern String' :: Schema
- data DefaultValue
- = DNull
- | DBoolean !Bool
- | DInt Schema Int32
- | DLong Schema Int64
- | DFloat Schema Float
- | DDouble Schema Double
- | DBytes Schema ByteString
- | DString Schema Text
- | DArray (Vector DefaultValue)
- | DMap (HashMap Text DefaultValue)
- | DRecord Schema (HashMap Text DefaultValue)
- | DUnion (Vector Schema) Schema DefaultValue
- | DFixed Schema !ByteString
- | DEnum Schema Int Text
- data Field = Field {
- fldName :: Text
- fldAliases :: [Text]
- fldDoc :: Maybe Text
- fldOrder :: Maybe Order
- fldType :: Schema
- fldDefault :: Maybe DefaultValue
- data Order
- data TypeName = TN {}
- data Decimal = Decimal {}
- newtype LogicalTypeBytes = DecimalB Decimal
- data LogicalTypeFixed
- data LogicalTypeInt
- data LogicalTypeLong
- data LogicalTypeString = UUID
- renderFullname :: TypeName -> Text
- parseFullname :: Text -> TypeName
- mkEnum :: TypeName -> [TypeName] -> Maybe Text -> [Text] -> Schema
- mkUnion :: NonEmpty Schema -> Schema
- validateSchema :: Schema -> Parser ()
- typeName :: Schema -> Text
- buildTypeEnvironment :: Applicative m => (TypeName -> m Schema) -> Schema -> TypeName -> m Schema
- extractBindings :: Schema -> HashMap TypeName Schema
- data Result a
- badValue :: Show t => t -> String -> Result a
- resultToEither :: Result b -> Either String b
- matches :: Schema -> Schema -> Bool
- parseBytes :: Text -> Result ByteString
- serializeBytes :: ByteString -> Text
- parseAvroJSON :: (Schema -> Value -> Result DefaultValue) -> (TypeName -> Maybe Schema) -> Schema -> Value -> Result DefaultValue
- overlay :: Schema -> Schema -> Schema
- subdefinition :: Schema -> Text -> Maybe Schema
- expandNamedTypes :: Schema -> Schema
Schema description types
N.B. It is possible to create a Haskell value (of Schema
type) that is
not a valid Avro schema by violating one of the above or one of the
conditions called out in validateSchema
.
Null | |
Boolean | |
Int | |
Long | |
Float | |
Double | |
Bytes | |
String | |
Array | |
Map | |
NamedType TypeName | |
Record | |
Enum | |
Union | |
Fixed | |
|
Instances
data DefaultValue Source #
DNull | |
DBoolean !Bool | |
DInt Schema Int32 | |
DLong Schema Int64 | |
DFloat Schema Float | |
DDouble Schema Double | |
DBytes Schema ByteString | |
DString Schema Text | |
DArray (Vector DefaultValue) | Dynamically enforced monomorphic type. |
DMap (HashMap Text DefaultValue) | Dynamically enforced monomorphic type |
DRecord Schema (HashMap Text DefaultValue) | |
DUnion (Vector Schema) Schema DefaultValue | Set of union options, schema for selected option, and the actual value. |
DFixed Schema !ByteString | |
DEnum Schema Int Text | An enum is a set of the possible symbols (the schema) and the selected symbol |
Instances
Field | |
|
Instances
Instances
Eq Order Source # | |
Ord Order Source # | |
Show Order Source # | |
Generic Order Source # | |
ToJSON Order Source # | |
Defined in Data.Avro.Schema.Schema | |
FromJSON Order Source # | |
NFData Order Source # | |
Defined in Data.Avro.Schema.Schema | |
Lift Order Source # | |
type Rep Order Source # | |
Defined in Data.Avro.Schema.Schema type Rep Order = D1 ('MetaData "Order" "Data.Avro.Schema.Schema" "avro-0.6.1.1-94Hp1TbzBg2Dps8RxrGyIK" 'False) (C1 ('MetaCons "Ascending" 'PrefixI 'False) (U1 :: Type -> Type) :+: (C1 ('MetaCons "Descending" 'PrefixI 'False) (U1 :: Type -> Type) :+: C1 ('MetaCons "Ignore" 'PrefixI 'False) (U1 :: Type -> Type))) |
A named type in Avro has a name and, optionally, a namespace.
A name is a string that starts with an ASCII letter or underscore followed by letters, underscores and digits:
name ::= [A-Za-z_][A-Za-z0-9_]*
Examples include "_foo7"
, Bar_
and "x"
.
A namespace is a sequence of names with the same lexical
structure. When written as a string, the components of a namespace
are separated with dots ("com.example"
).
TypeName
represents a fullname—a name combined with a
namespace. These are written and parsed as dot-separated
strings. The TypeName
TN Foo ["com", "example"]
is rendered
as "com.example.Foo"
.
Fullnames have to be globally unique inside an Avro schema.
A namespace of []
or [""]
is the "null namespace". In avro
an explicitly null-namespaced identifier is written as ".Foo"
Instances
The decimal
logical type represents arbitrary-precision decimal
numbers. Numbers are represented as unscaled * (10 ** -scale)
where scale
is part of the logical type and unscaled
is an
integer represented by the underlying primitive type.
Instances of the decimal
logical type need to specify a scale
and precision
.
decimal
can be encoded as one of several different primitive
types:
bytes
fixed
long
int
For long
and int
, unscaled
is the underlying number.
For bytes
and fixed
, unscaled
is represented as a
two's-complement signed integer in big-endian byte order.
Note: int
and long
representations for decimal
are not part
of the current Avro
specification,
but they are supported by some language implementations including
the official Java library. Implementations that do not support this
should ignore the logical type and use the underlying primitive
type instead.
Instances
Eq Decimal Source # | |
Ord Decimal Source # | |
Show Decimal Source # | |
Generic Decimal Source # | |
NFData Decimal Source # | |
Defined in Data.Avro.Schema.Schema | |
Lift Decimal Source # | |
type Rep Decimal Source # | |
Defined in Data.Avro.Schema.Schema type Rep Decimal = D1 ('MetaData "Decimal" "Data.Avro.Schema.Schema" "avro-0.6.1.1-94Hp1TbzBg2Dps8RxrGyIK" 'False) (C1 ('MetaCons "Decimal" 'PrefixI 'True) (S1 ('MetaSel ('Just "precision") 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedStrict) (Rec0 Integer) :*: S1 ('MetaSel ('Just "scale") 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedStrict) (Rec0 Integer))) |
newtype LogicalTypeBytes Source #
Instances
data LogicalTypeFixed Source #
DecimalF Decimal | An arbitrary-precision signed decimal number. See |
Duration | An interval of time, represented as some number of months, days and milliseconds. Encoded as three little-endian unsigned integers for months, days and milliseconds respectively. |
Instances
data LogicalTypeInt Source #
DecimalI Decimal | An arbitrary-precision signed decimal number. See |
Date | A date (eg Encoded as the number of days before/after the Unix epoch (1970-01-01). |
TimeMillis | A time of day with millisecond precision. Encoded as the number of milliseconds after midnight. |
Instances
data LogicalTypeLong Source #
DecimalL Decimal | An arbitrary-precision signed decimal number. See |
TimeMicros | A time of day with microsecond precision. Encoded as the number of microseconds after midnight. |
TimestampMillis | A UTC timestamp with millisecond precision. Encoded as the number of milliseconds before/after the Unix epoch (1970-01-01 00:00:00.000). |
TimestampMicros | A UTC timestamp with microsecond precision. Encoded as the number of microseconds before/after the Unix epoch (1970-01-01 00:00:00.000000). |
LocalTimestampMillis | A timestamp in the local timezone, whatever that happens to be, with millisecond precision. Encoded as the number of milliseconds before/after the Unix epoch (1970-01-01 00:00:00.000). |
LocalTimestampMicros | A timestamp in the local timezone, whatever that happens to be, with microsecond precision. Encoded as the number of microseconds before/after the Unix epoch (1970-01-01 00:00:00.000000). |
Instances
data LogicalTypeString Source #
Instances
renderFullname :: TypeName -> Text Source #
parseFullname :: Text -> TypeName Source #
:: TypeName | The name of the enum (includes namespace). |
-> [TypeName] | Aliases for the enum (if any). |
-> Maybe Text | Optional documentation for the enum. |
-> [Text] | The symbols of the enum. |
-> Schema |
Build an Enum
value from its components.
mkUnion :: NonEmpty Schema -> Schema Source #
mkUnion subTypes
Defines a union of the provided subTypes. N.B. it is
invalid Avro to include another union or to have more than one of the same
type as a direct member of the union. No check is done for this condition!
validateSchema :: Schema -> Parser () Source #
Placeholder NO-OP function!
Validates a schema to ensure:
- All types are defined
- Unions do not directly contain other unions
- Unions are not ambiguous (may not contain more than one schema with the same type except for named types of record, fixed and enum)
- Default values for unions can be cast as the type indicated by the first structure.
- Default values can be cast/de-serialize correctly.
- Named types are resolvable
Lower level utilities
typeName :: Schema -> Text Source #
Get the name of the type. In the case of unions, get the name of the first value in the union schema.
:: Applicative m | |
=> (TypeName -> m Schema) | Callback to handle type names not in the schema. |
-> Schema | The schema that we're generating a lookup function for. |
-> TypeName -> m Schema |
buildTypeEnvironment schema
builds a function mapping type names to
the types declared in the traversed schema.
This mapping includes both the base type names and any aliases they have. Aliases and normal names are not differentiated in any way.
extractBindings :: Schema -> HashMap TypeName Schema Source #
extractBindings schema
traverses a schema and builds a map of all declared
types.
Types declared implicitly in record field definitions are also included. No distinction is made between aliases and normal names.
Instances
matches :: Schema -> Schema -> Bool Source #
Checks that two schemas match. This is like equality of schemas,
except NamedTypes
match against other types with the same name.
This extends recursively: two records match if they have the same name, the same number of fields and the fields all match.
parseBytes :: Text -> Result ByteString Source #
Parses a string literal into a bytestring in the format expected for bytes and fixed values. Will fail if every character does not have a codepoint between 0 and 255.
serializeBytes :: ByteString -> Text Source #
Turn a ByteString
into a Text
that matches the format Avro
expects from bytes and fixed literals in JSON. Each byte is mapped
to a single Unicode codepoint between 0 and 255.
:: (Schema -> Value -> Result DefaultValue) | How to handle unions. The way unions are formatted in JSON depends on whether we're parsing a normal Avro object or we're parsing a default declaration in a schema. This function will only ever be passed |
-> (TypeName -> Maybe Schema) | |
-> Schema | |
-> Value | |
-> Result DefaultValue |
Parse JSON-encoded avro data.
overlay :: Schema -> Schema -> Schema Source #
Merge two schemas to produce a third.
Specifically, overlay schema reference
fills in NamedTypes
in schema
using any matching definitions from reference
.
subdefinition :: Schema -> Text -> Maybe Schema Source #
Extract the named inner type definition as its own schema.
expandNamedTypes :: Schema -> Schema Source #