datasets: Classical data sets for statistics and machine learning

[ data, data-mining, library, machine-learning, mit, statistics ]

Classical machine learning and statistics datasets from the UCI Machine Learning Repository and other sources.

The datasets package defines two kinds of data sets:

  • Small data sets that are embedded in the package as pure values, either directly or indirectly with `file-embed`, and so need no network access or IO. These include Iris, Anscombe and OldFaithful.

  • Other data sets that must be fetched over the network with Numeric.Datasets.getDataset. The downloaded data is cached in a local temporary directory.

import Numeric.Datasets (getDataset)
import Numeric.Datasets.Iris (iris)
import Numeric.Datasets.Abalone (abalone)

main :: IO ()
main = do
  -- The Iris data set is embedded in the package as a pure list
  print (length iris)
  print (head iris)
  -- The Abalone data set is fetched over the network by getDataset
  abas <- getDataset abalone
  print (length abas)
  print (head abas)
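
The description says that fetched data sets are cached in a local temporary directory. The following is a minimal sketch of that general idea, not the package's actual implementation: `cachedFetch`, its parameters and the directory name are hypothetical names introduced for illustration, and the download step is left as an arbitrary IO action supplied by the caller.

```haskell
import Control.Monad (unless)
import System.Directory (createDirectoryIfMissing, doesFileExist, getTemporaryDirectory)
import System.FilePath ((</>))

-- | Hypothetical helper (not part of the datasets API): return the contents
-- of a cached file, running the supplied fetch action only on a cache miss.
cachedFetch
  :: FilePath   -- ^ name of the cache subdirectory (illustrative)
  -> FilePath   -- ^ file name to cache the data under
  -> IO String  -- ^ action that obtains the raw data, e.g. an HTTP request
  -> IO String
cachedFetch subdir name fetch = do
  tmp <- getTemporaryDirectory
  let dir  = tmp </> subdir
      path = dir </> name
  createDirectoryIfMissing True dir
  exists <- doesFileExist path
  -- Only run the (potentially slow) fetch when no cached copy exists yet.
  unless exists $ do
    contents <- fetch
    writeFile path contents
  cached <- readFile path
  -- Force the lazy read so the file handle is released before returning.
  length cached `seq` return cached
```

For instance, calling `cachedFetch "datasets-sketch" "abalone.data" download` twice, where `download` is whatever action retrieves the file, would run `download` only on the first call; the second call reads the copy stored under the temporary directory.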

Modules

  • Numeric
    • Numeric.Datasets
      • Numeric.Datasets.Abalone
      • Numeric.Datasets.Adult
      • Numeric.Datasets.Anscombe
      • Numeric.Datasets.BostonHousing
      • Numeric.Datasets.BreastCancerWisconsin
      • Numeric.Datasets.Car
      • Numeric.Datasets.Iris
      • Numeric.Datasets.Michelson
      • Numeric.Datasets.Nightingale
      • Numeric.Datasets.OldFaithful
      • Numeric.Datasets.Wine

Downloads

Note: This package has metadata revisions in the cabal description newer than included in the tarball. To unpack the package including the revisions, use 'cabal get'.

Versions: 0.1.0, 0.1.0.1, 0.2, 0.2.0.1, 0.2.0.2, 0.2.0.3, 0.2.1, 0.2.2, 0.2.3, 0.2.4, 0.2.5, 0.3.0, 0.4.0
Dependencies: aeson, base (<0), bytestring, cassava, directory, file-embed, filepath, hashable, HTTP, stringsearch, text, time, vector
License: MIT
Author: Tom Nielsen <tanielsen@gmail.com>
Maintainer: Tom Nielsen <tanielsen@gmail.com>
Revised: Revision 1 made by HerbertValerioRiedel at 2016-11-27T15:01:32Z
Category: Statistics, Machine Learning, Data Mining, Data
Home page: https://github.com/glutamate/datasets
Bug tracker: https://github.com/glutamate/datasets/issues
Source repo (head): git clone https://github.com/glutamate/datasets
Uploaded: by glutamate at 2016-11-27T10:21:29Z
Reverse Dependencies: 1 direct, 1 indirect
Downloads: 9844 total (43 in the last 30 days)
Rating: 2.0 (votes: 1; estimated by Bayesian average)
Status: Docs not available. All reported builds failed as of 2016-11-27 (3 reports).