record-encode: Generic encoding of records

[ bsd3, data, data-mining, data-science, deprecated, library, machine-learning ]

Generic encoding of records. It currently provides a single, polymorphic function to encode sum types (i.e. categorical variables) as one-hot vectors.

Versions 0.2, 0.2.1, 0.2.2, 0.2.3
Dependencies base (>=4.7 && <5), generics-sop, vector
License BSD-3-Clause
Copyright 2018 Marco Zocca
Author Marco Zocca
Maintainer ocramz fripost org
Category Data, Data Science, Data Mining, Machine Learning
Source repo head: git clone
Uploaded by ocramz at Wed Jan 23 21:08:29 UTC 2019
Distributions NixOS:0.2.3
Readme for record-encode-0.2.3

Encoding categorical variables

This library provides generic machinery to encode values of some algebraic type as points in a vector space.

Values of a sum type (e.g. enumerations) are also called "categorical" variables in statistics, because they encode a choice between a number of discrete categories.

On the other hand, many data science and machine learning algorithms rely on a purely numerical representation of data, and the code that converts values of a static type into that representation is often "boilerplate", i.e. largely repetitive and uninformative.

The encodeOneHot function provided here is a generic utility function (i.e. defined once and for all) to compute the one-hot representation of any sum type.
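
To make the idea of a one-hot representation concrete, here is a minimal sketch that relies only on the Enum and Bounded classes from base. It is not how record-encode is implemented (the library works through generics-sop). The Colour type and the oneHotEnum function are hypothetical names introduced purely for illustration.

```haskell
module Main where

-- Hypothetical enumeration, used only for illustration.
data Colour = Red | Green | Blue
  deriving (Show, Eq, Enum, Bounded)

-- | One-hot encode a value of any bounded enumeration as a dense list.
-- Assumption: this uses Enum/Bounded from base, not the generics-sop
-- machinery that record-encode itself relies on.
oneHotEnum :: (Enum a, Bounded a) => a -> [Int]
oneHotEnum x = [ if i == ix then 1 else 0 | i <- [0 .. dim - 1] ]
  where
    lo  = fromEnum (minBound `asTypeOf` x)   -- position of the first constructor
    hi  = fromEnum (maxBound `asTypeOf` x)   -- position of the last constructor
    ix  = fromEnum x - lo                    -- zero-based index of the active constructor
    dim = hi - lo + 1                        -- number of constructors

main :: IO ()
main = mapM_ (print . oneHotEnum) [Red, Green, Blue]
```

A sketch like this only covers constructors without fields and needs Enum and Bounded instances on every type, which is the kind of per-type ceremony a generic function is meant to avoid.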

Usage example

    {-# language DeriveGeneric #-}

    import qualified GHC.Generics as G
    import qualified Generics.SOP as SOP
    import Data.Record.Encode

    data X = A | B | C deriving (G.Generic)
    instance SOP.Generic X

    -- In GHCi:
    > encodeOneHot B
    OH {oDim = 3, oIx = 1}

Please refer to the documentation of Data.Record.Encode for more examples and details.
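
The result printed in the usage example is a compact pair: the dimension of the space (3) and the zero-based index of the active constructor (1). As a rough sketch, such a pair could be expanded into a dense numeric vector as follows. OneHotSketch and toDense are hypothetical stand-ins that only mirror the printed fields; they are not the library's own type or API, which may offer its own conversion.

```haskell
module Main where

-- Hypothetical stand-in mirroring the two fields printed in the usage
-- example (dimension and active index); not the library's own type.
data OneHotSketch = OneHotSketch
  { sketchDim :: Int  -- number of constructors of the encoded type
  , sketchIx  :: Int  -- zero-based index of the active constructor
  } deriving (Show, Eq)

-- | Expand the compact form into a dense list with a single 1 entry.
toDense :: Num a => OneHotSketch -> [a]
toDense (OneHotSketch d i) = [ if j == i then 1 else 0 | j <- [0 .. d - 1] ]

main :: IO ()
main = print (toDense (OneHotSketch 3 1) :: [Double])
-- prints [0.0,1.0,0.0]
```

Keeping the compact form until a dense vector is actually needed avoids allocating mostly-zero vectors for types with many constructors.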


Acknowledgements

Thanks to Gagandeep Bhatia (@gagandeepb) for his Google Summer of Code 2018 work on Frames-beam, Mark Karpov (@mrkkrp) for his Template Haskell tutorial, Anthony Cowley (@acowley) for Frames, and @mniip on Freenode #haskell for helping me better understand what can be done with generic programming.