record-encode-0.2.1: Generic encoding of records

Safe Haskell: None
Language: Haskell2010

Data.Record.Encode

Description

This library provides generic machinery (via GHC.Generics and `generics-sop`) to encode values of some algebraic type as points in a vector space.

Datasets with one or more categorical variables (that is, values of a sum type) typically need a series of boilerplate transformations before they can be processed; the encodeOneHot function provided here performs that encoding generically.
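To make the boilerplate concrete, here is a minimal hand-written sketch that one-hot encodes a categorical value through the standard Enum and Bounded classes. This is a different technique from the generic machinery of this library, and the names Weather and oneHotEnum are hypothetical, introduced only for illustration.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}

-- Hypothetical categorical variable, used only for this illustration.
data Weather = Sunny | Cloudy | Rainy
  deriving (Show, Eq, Enum, Bounded)

-- | Hand-written one-hot encoding via Enum/Bounded (not this library's method).
-- Produces one entry per constructor: 1 at the value's position, 0 elsewhere.
oneHotEnum :: forall a. (Enum a, Bounded a) => a -> [Int]
oneHotEnum x =
  [ if fromEnum c == fromEnum x then 1 else 0
  | c <- [minBound .. maxBound :: a]
  ]

main :: IO ()
main = do
  print (oneHotEnum Sunny)  -- prints [1,0,0]
  print (oneHotEnum Rainy)  -- prints [0,0,1]
```

Every categorical type handled this way needs Enum and Bounded instances, and the approach does not extend to constructors that carry fields, which is one motivation for a Generic-based encoder.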

Internals

This library makes use of generic programming to analyze both values and types (see the internal Data.Record.Encode.Generics module).

Initially, it relied on Template Haskell to analyze types, using the instance generation machinery explained here: https://markkarpov.com/tutorial/th.html#example-1-instance-generation
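The idea of analyzing both values and types generically can be sketched with GHC.Generics alone. The class GSum and the function oneHotIx below are hypothetical names, not the library's internal GVariants class or its actual implementation; the sketch only shows how the number of constructors can be read off the type while the constructor index is read off the value.

```haskell
{-# LANGUAGE DeriveGeneric       #-}
{-# LANGUAGE FlexibleContexts    #-}
{-# LANGUAGE FlexibleInstances   #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeOperators       #-}

import Data.Proxy (Proxy (..))
import GHC.Generics

-- | Hypothetical class over generic representations of sum types.
class GSum f where
  gCount :: Proxy f -> Int  -- number of constructors, from the type alone
  gIndex :: f p -> Int      -- zero-based constructor index, from the value

-- A single constructor counts as one variant with local index 0.
instance GSum (C1 c f) where
  gCount _ = 1
  gIndex _ = 0

-- A choice between two groups of constructors.
instance (GSum f, GSum g) => GSum (f :+: g) where
  gCount _ = gCount (Proxy :: Proxy f) + gCount (Proxy :: Proxy g)
  gIndex (L1 x) = gIndex x
  gIndex (R1 y) = gCount (Proxy :: Proxy f) + gIndex y

-- Datatype metadata wrapper: delegate to the wrapped representation.
instance GSum f => GSum (D1 c f) where
  gCount _ = gCount (Proxy :: Proxy f)
  gIndex (M1 x) = gIndex x

-- | (dimension, index) pair for any value whose type derives Generic.
oneHotIx :: forall a. (Generic a, GSum (Rep a)) => a -> (Int, Int)
oneHotIx x = (gCount (Proxy :: Proxy (Rep a)), gIndex (from x))

data X = A | B | C deriving (Show, Generic)

main :: IO ()
main = print (oneHotIx B)  -- prints (3,1)
```

The pair returned here plays the role of the oDim and oIx fields of OneHot documented below.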

One-hot encoding

encodeOneHot :: forall a. G a => a -> OneHot

Computes the one-hot encoding of a value of a sum type.

The type of the input value must be an instance of Generic (from GHC.Generics) and of Generic (from the `generics-sop` library).

>>> :set -XDeriveGeneric
>>> import qualified GHC.Generics as G
>>> import qualified Generics.SOP as SOP
>>> import Data.Record.Encode
>>> data X = A | B | C deriving (G.Generic)
>>> instance SOP.Generic X
>>> encodeOneHot B
OH {oDim = 3, oIx = 1}

Types and Utilities

data OneHot

A one-hot encoding is a d-dimensional vector having a single component equal to 1 and all others equal to 0.

Constructors

OH 

Fields

  • oDim :: !Int

    Dimension of ambient space (i.e. number of categories)

  • oIx :: !Int

    Index of nonzero entry

Instances
Eq OneHot

Defined in Data.Record.Encode

Methods

(==) :: OneHot -> OneHot -> Bool

(/=) :: OneHot -> OneHot -> Bool

Show OneHot

Defined in Data.Record.Encode

oneHotV :: Num a => OneHot -> Vector a

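Creates the one-hot vector described by a OneHot value: a vector of oDim components with a 1 at index oIx and 0 everywhere else.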
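As an illustration of the expected shape of the result, the following sketch builds the same kind of dense encoding as a plain list. It assumes a locally defined stand-in record (OneHot', with hypothetical fields dim and ix) and a hypothetical helper oneHotList, so that it compiles with base alone; it is not the library's oneHotV, which returns a Vector.

```haskell
-- Local stand-in for the library's OneHot record (hypothetical names).
data OneHot' = OH' { dim :: Int, ix :: Int }
  deriving (Eq, Show)

-- | Dense one-hot list: `dim` entries, 1 at position `ix`, 0 elsewhere.
oneHotList :: Num a => OneHot' -> [a]
oneHotList (OH' d i) = [ if j == i then 1 else 0 | j <- [0 .. d - 1] ]

main :: IO ()
main = print (oneHotList (OH' 3 1) :: [Int])  -- prints [0,1,0]
```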

Generics-related

type G a = (GVariants (Rep a), Generic a, Generic a)

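Constraints necessary to call encodeOneHot on a value. The two Generic constraints refer to the classes of the same name from GHC.Generics and from the `generics-sop` library.

NB: GVariants is an internal typeclass, and its constraint is satisfied automatically if the type is an instance of Generic.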