datasets: Classical data sets for statistics and machine learning

[ data, data-mining, library, machine-learning, mit, statistics ]

Classical machine learning and statistics datasets from the UCI Machine Learning Repository and other sources.

The datasets package defines two kinds of data sets:

  • small data sets that are embedded in the package as pure values, either directly or through `file-embed`, and need no network access or IO to load. These include Iris, Anscombe and OldFaithful.

  • other data sets that must be fetched over the network with Numeric.Datasets.getDataset and are cached in a local temporary directory.

import Numeric.Datasets (getDataset)
import Numeric.Datasets.Iris (iris)
import Numeric.Datasets.Abalone (abalone)

main :: IO ()
main = do
  -- The Iris data set is embedded
  print (length iris)
  print (head iris)
  -- The Abalone dataset is fetched
  abas <- getDataset abalone
  print (length abas)
  print (head abas)
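
The caching behaviour described for fetched data sets can be illustrated with a minimal sketch. This is not the package's implementation: `cachedFetch`, the file name and the stand-in download action are all hypothetical, and only the `directory` and `filepath` packages are assumed. It shows the general idea of running the fetch action once and serving later requests from a file in the temporary directory.

```haskell
import Control.Monad (unless)
import System.Directory (doesFileExist, getTemporaryDirectory)
import System.FilePath ((</>))

-- | Hypothetical helper (not part of the datasets API): return the contents
-- of a cached file, running the supplied fetch action only on a cache miss.
cachedFetch :: FilePath -> IO String -> IO String
cachedFetch name fetch = do
  tmp <- getTemporaryDirectory
  let path = tmp </> name
  hit <- doesFileExist path
  -- On a miss, run the fetch action and store its result in the cache file.
  unless hit $ fetch >>= writeFile path
  contents <- readFile path
  -- Force the lazy read so the file handle is released before returning.
  length contents `seq` pure contents

main :: IO ()
main = do
  -- Stand-in for a network download; a real fetch would issue an HTTP request.
  let fakeDownload = putStrLn "fetching..." >> pure "4.5,0.35\n5.1,0.42\n"
  first  <- cachedFetch "datasets-sketch-example.csv" fakeDownload
  second <- cachedFetch "datasets-sketch-example.csv" fakeDownload
  print (length (lines first), first == second)
```

Within a single run the second call finds the cached file and does not run the fetch action again. A real implementation would also need to handle failed downloads and stale cache entries, which this sketch leaves out.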

Versions 0.1.0, 0.1.0.1, 0.2, 0.2.0.1, 0.2.0.2, 0.2.0.3, 0.2.1, 0.2.2, 0.2.3, 0.2.4, 0.2.5, 0.3.0, 0.4.0
Change log changelog.md
Dependencies aeson, attoparsec (>=0.13), base (>=4.6 && <5), bytestring, cassava, directory, file-embed, filepath, hashable, microlens, stringsearch, text, time, vector, wreq
Tested with ghc ==7.10.2, ghc ==7.10.3, ghc ==8.0.1
License MIT
Author Tom Nielsen <tanielsen@gmail.com>
Maintainer Tom Nielsen <tanielsen@gmail.com>
Category Statistics, Machine Learning, Data Mining, Data
Home page https://github.com/filopodia/open/datasets
Bug tracker https://github.com/filopodia/open/issues
Source repo head: git clone https://github.com/filopodia/open
Uploaded by glutamate at 2017-04-20T08:00:30Z
Reverse Dependencies 1 direct, 1 indirect
Downloads 10135 total (47 in the last 30 days)
Rating 2.0 (votes: 1) [estimated by Bayesian average]
Status Docs available
Last success reported on 2017-04-20 [all 1 reports]