datasets: Classical data sets for statistics and machine learning

[ data, data-mining, library, machine-learning, mit, statistics ]

Classical machine learning and statistics datasets from the UCI Machine Learning Repository and other sources.

The datasets package defines two different kinds of datasets:

  • small data sets that are embedded in the package as pure values, either directly or via `file-embed`, and need no network access or IO to load. These include Iris, Anscombe and OldFaithful.

  • other data sets, which are fetched over the network with Numeric.Datasets.getDataset and cached in a local temporary directory.

import Numeric.Datasets (getDataset)
import Numeric.Datasets.Iris (iris)
import Numeric.Datasets.Abalone (abalone)

main :: IO ()
main = do
  -- The Iris data set is embedded
  print (length iris)
  print (head iris)
  -- The Abalone dataset is fetched
  abas <- getDataset abalone
  print (length abas)
  print (head abas)
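
The "fetched once, then cached in a local temporary directory" behaviour described above can be illustrated with a minimal sketch. This is not the package's actual implementation: `cachedFetch`, the cache directory name and the stand-in download action are all hypothetical, and the sketch assumes only the `directory` and `filepath` packages (both listed among the dependencies).

```haskell
module Main where

import System.Directory
  ( createDirectoryIfMissing
  , doesFileExist
  , getTemporaryDirectory
  )
import System.FilePath ((</>))

-- | Hypothetical helper (not part of the datasets API): return the
-- contents of a cache file, running the fetch action only when the
-- file does not exist yet.
cachedFetch
  :: FilePath   -- ^ cache directory
  -> FilePath   -- ^ file name inside the cache directory
  -> IO String  -- ^ action that produces the data, e.g. an HTTP GET
  -> IO String
cachedFetch dir name fetch = do
  createDirectoryIfMissing True dir
  let path = dir </> name
  exists <- doesFileExist path
  if exists
    then readFile path            -- cache hit: no fetch is performed
    else do
      contents <- fetch           -- cache miss: fetch, then store
      writeFile path contents
      pure contents

main :: IO ()
main = do
  tmp <- getTemporaryDirectory
  let dir = tmp </> "datasets-cache-sketch"  -- assumed directory name
      -- Stand-in for a network download, so the sketch runs offline.
      fakeDownload = putStrLn "fetching..." >> pure "M,0.455,0.365\n"
  first  <- cachedFetch dir "abalone.data" fakeDownload
  second <- cachedFetch dir "abalone.data" fakeDownload
  putStrLn (if first == second then "same contents" else "contents differ")
```

Passing the download as an `IO String` argument keeps the caching logic independent of any particular HTTP library; in real use the stand-in would be replaced by an actual request.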

Downloads

Note: This package has metadata revisions in the cabal description newer than included in the tarball. To unpack the package including the revisions, use 'cabal get'.

Versions: 0.1.0, 0.1.0.1, 0.2, 0.2.0.1, 0.2.0.2, 0.2.0.3, 0.2.1, 0.2.2, 0.2.3, 0.2.4, 0.2.5, 0.3.0, 0.4.0
Change log: changelog.md
Dependencies: aeson, base (>=4.8 && <5), bytestring, cassava, directory, file-embed, filepath, hashable, HTTP, stringsearch, text, time, vector
License: MIT
Author: Tom Nielsen <tanielsen@gmail.com>
Maintainer: Tom Nielsen <tanielsen@gmail.com>
Revised: Revision 1 made by HerbertValerioRiedel at 2019-02-12T09:04:14Z
Category: Statistics, Machine Learning, Data Mining, Data
Home page: https://github.com/glutamate/datasets
Bug tracker: https://github.com/glutamate/datasets/issues
Source repo: head: git clone https://github.com/glutamate/datasets
Uploaded: by glutamate at 2016-11-27T14:58:41Z
Reverse Dependencies: 1 direct, 1 indirect
Downloads: 9870 total (42 in the last 30 days)
Rating: 2.0 (votes: 1) [estimated by Bayesian average]
Status: Docs available
Last success reported on 2016-11-27 (1 report)