hasklepias: Define features from events

[ bsd3, data-science, deprecated, library, program ]
Deprecated
Versions 0.4.2, 0.4.3, 0.4.4, 0.5.0, 0.6.0, 0.6.1, 0.7.0, 0.7.1, 0.8.3, 0.12.0, 0.13.0, 0.13.1, 0.14.0, 0.15.0, 0.15.1, 0.15.2, 0.16.0, 0.16.1, 0.17.0, 0.17.1, 0.18.0, 0.20.0
Change log ChangeLog.md
Dependencies aeson (>=1.4.0.0 && <2), base (>=4.14 && <4.15), bytestring (>=0.10), containers (>=0.6.0), flow (==1.0.22), interval-algebra (==0.8.2), QuickCheck, safe (>=0.3), text (>=1.2.3), time (>=1.11), unordered-containers (>=0.2.10), vector (>=0.12), witherable (>=0.4)
License BSD-3-Clause
Copyright NoviSci, Inc
Author Bradley Saul
Maintainer bsaul@novisci.com
Category Data Science
Home page https://github.com/novisci/asclepias/#readme
Bug tracker https://github.com/novisci/asclepias/issues
Source repo head: git clone https://github.com/novisci/asclepias
Uploaded by bradleysaul at 2021-05-24T17:18:10Z
Downloads 3808 total (46 in the last 30 days)
Status Docs available [build log]
Last success reported on 2021-05-24 [all 1 reports]

Readme for hasklepias-0.6.0

Project Asclepias

Asclepias (n):

  1. The genus of North American milkweeds, named by Linnaeus after the Greek god of healing, Asclepius.
  2. A language and software project for defining and deriving features from temporally ordered events using the interval algebra.

Current status

The initial versions of hasklepias will focus on the ability to derive features from a sorted collection of events. At this time, developers can experiment with feature definitions (see the examples directory).

Getting started

The official implementation of Asclepias is the embedded domain-specific language (eDSL) provided by the hasklepias Haskell library. To get started, you'll need to install the Haskell toolchain, in particular the Glasgow Haskell Compiler (GHC) and the build and packaging tool cabal. Both can be installed with the ghcup utility.

You can use any development environment you choose, but for maximum coding pleasure, it is highly recommended that you install the Haskell Language Server (hls). It can be installed using ghcup, and some integrated development environments, such as Visual Studio Code, have excellent hls integration.

Defining features

At this time, hasklepias can be used for experimenting with Feature definitions. A Feature d is currently a wrapper of an Either type:

newtype Feature d = Feature { getFeatureData :: Either MissingReason d }

The Either type means there are two possibilities for the type of a Feature. The Left can be a MissingReason, which is a sum type enumerating the reasons that the data is missing:

data MissingReason = -- this list may grow/change in the future
    InsufficientData
  | Excluded
  | Other String
  | Unknown
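
As a minimal sketch of how the two cases of the Either can be handled, the block below restates the two definitions locally (rather than importing them from hasklepias, whose exports may differ between versions) and pattern matches on them. The helpers describeMissing and featureOr, and the values observedAge and missingAge, are hypothetical names introduced only for this illustration.

```haskell
-- Local copies of the README definitions, so the sketch compiles on its own.
data MissingReason
  = InsufficientData
  | Excluded
  | Other String
  | Unknown
  deriving (Eq, Show)

newtype Feature d = Feature { getFeatureData :: Either MissingReason d }
  deriving (Eq, Show)

-- Hypothetical helper: a human-readable label for each missing reason.
describeMissing :: MissingReason -> String
describeMissing InsufficientData = "insufficient data"
describeMissing Excluded         = "excluded"
describeMissing (Other why)      = "other: " ++ why
describeMissing Unknown          = "unknown"

-- Hypothetical helper: return the feature's data if present (Right),
-- otherwise the supplied fallback (Left).
featureOr :: d -> Feature d -> d
featureOr fallback = either (const fallback) id . getFeatureData

-- One feature that carries data and one that is missing.
observedAge, missingAge :: Feature Int
observedAge = Feature (Right 42)
missingAge  = Feature (Left (Other "birth date not recorded"))
```

Loaded in ghci, featureOr 0 observedAge evaluates to 42, featureOr 0 missingAge evaluates to 0, and either describeMissing show (getFeatureData missingAge) evaluates to "other: birth date not recorded".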

The Right has the type d, meaning it can be any type you choose. In the module ExampleFeatures1, the index feature has type Feature (Interval a). The (Right) type of index is an Interval a, where again a can be any type you choose, subject to the constraints of intervals. The hasDuckHistory feature has the type Feature (Bool, Maybe (Interval a)), where the Bool is used as an indicator of a history with ducks and the Maybe (Interval a) is the Interval a of the last encounter with a duck, if one exists. The countOfHospitalEvents feature has the type Feature (Int, Maybe b), where the Int is the count of hospital visits and the Maybe b is the duration of the last visit, if one exists. These examples show how the data (or shape) of a Feature can be defined as Interval a, (Bool, Maybe (Interval a)), or (Int, Maybe b). In fact, as long as the data is derivable from other Features and/or a list of Events, you can shape a Feature however you'd like!
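
To illustrate a feature shaped as (Int, Maybe b), here is a sketch in the spirit of countOfHospitalEvents. It does not use the hasklepias or interval-algebra APIs: SimpleEvent, countAndLastDuration, sampleEvents, the "wasHospitalized" concept label, and the local Feature and MissingReason copies are all simplified stand-ins assumed for this example, with events modeled as integer begin/end pairs carrying a single concept. Treating an empty event list as InsufficientData is likewise a choice made for the sketch.

```haskell
-- Simplified stand-ins (assumptions), not the hasklepias types.
data MissingReason = InsufficientData | Excluded | Other String | Unknown
  deriving (Eq, Show)

newtype Feature d = Feature { getFeatureData :: Either MissingReason d }
  deriving (Eq, Show)

-- A hypothetical event: begin, end, and one concept label.
data SimpleEvent = SimpleEvent
  { begin   :: Int
  , end     :: Int
  , concept :: String
  } deriving (Eq, Show)

-- Count the events carrying the given concept and report the duration of
-- the last one, if any. The input is assumed to be sorted by time, so the
-- last match in the list is the most recent one. An empty event list is
-- treated as insufficient data.
countAndLastDuration :: String -> [SimpleEvent] -> Feature (Int, Maybe Int)
countAndLastDuration _ [] = Feature (Left InsufficientData)
countAndLastDuration c events = Feature (Right (length matches, lastDuration))
  where
    matches = filter ((== c) . concept) events
    lastDuration = case reverse matches of
      []      -> Nothing
      (e : _) -> Just (end e - begin e)

-- Hypothetical sample data.
sampleEvents :: [SimpleEvent]
sampleEvents =
  [ SimpleEvent 1 10 "enrollment"
  , SimpleEvent 5 6 "wasHospitalized"
  , SimpleEvent 12 15 "wasHospitalized"
  ]
```

In ghci, getFeatureData (countAndLastDuration "wasHospitalized" sampleEvents) evaluates to Right (2,Just 3), while an empty event list gives Left InsufficientData.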

Interactive use/development

To run the examples interactively, open a ghci session with:

cabal repl hasklepias:examples --repl-options -itest

The option flag --repl-options -itest allows you to make changes to the files in the examples folder and reload them with :reload (or :r) without exiting the ghci session. Developers working on src files can add the --repl-options -isrc option flag to make changes to src files too.

In ghci you have access to all exposed functions in hasklepias, interval-algebra, and those in the examples folders. For example, exampleEvents1 is a list of events used to check some of the example features, which you can interact with:

*Main> headMay exampleEvents1
Just {(1, 10), Context {getConcepts = ["enrollment"], getFacts = Nothing, getSource = Nothing}}
*Main> length exampleEvents1
24
*Main> combineIntervals $ intervals exampleEvents1
[(1, 10),(11, 20),(21, 30),(31, 40),(45, 100)]
*Main> mapM_ print exampleEvents1
{(1, 10), Context {getConcepts = fromList ["enrollment"], getFacts = Nothing, getSource = Nothing}}
{(2, 3), Context {getConcepts = fromList ["wasScratchedByCat"], getFacts = Nothing, getSource = Nothing}}
{(5, 6), Context {getConcepts = fromList ["hadMinorSurgery"], getFacts = Nothing, getSource = Nothing}}
{(5, 10), Context {getConcepts = fromList ["tookAntibiotics"], getFacts = Nothing, getSource = Nothing}}
{(11, 20), Context {getConcepts = fromList ["enrollment"], getFacts = Nothing, getSource = Nothing}}
{(21, 30), Context {getConcepts = fromList ["enrollment"], getFacts = Nothing, getSource = Nothing}}
{(31, 40), Context {getConcepts = fromList ["enrollment"], getFacts = Nothing, getSource = Nothing}}
{(45, 46), Context {getConcepts = fromList ["wasStruckByDuck"], getFacts = Nothing, getSource = Nothing}}
<<<result truncated>>>
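
The combineIntervals call in the session above collapses the event intervals into a shorter list. As a rough sketch of that idea, the function below merges plain (begin, end) pairs that overlap or meet. It is not the interval-algebra implementation, which works on its own Interval type and may differ in detail; combinePairs is a hypothetical name used only here.

```haskell
import Data.List (sortOn)

-- Merge (begin, end) pairs that overlap or meet into their union.
-- The input is sorted by begin first, so it need not be pre-sorted.
combinePairs :: Ord a => [(a, a)] -> [(a, a)]
combinePairs = foldr merge [] . sortOn fst
  where
    -- Fold from the right: 'acc' holds the already combined later pairs.
    merge x [] = [x]
    merge (b1, e1) acc@((b2, e2) : rest)
      | b2 <= e1  = merge (b1, max e1 e2) rest  -- absorb and keep checking
      | otherwise = (b1, e1) : acc              -- gap: keep both apart
```

For example, combinePairs [(1,10),(2,3),(5,6),(5,10),(11,20)] evaluates to [(1,10),(11,20)]: the three pairs inside (1,10) are absorbed, while (11,20) stays separate because it begins after 10.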