The tdigest package

[ Tags: bsd3, library, numeric ]

A new data structure for accurate on-line accumulation of rank-based statistics such as quantiles and trimmed means.

For more details, see the original paper: "Computing extremely accurate quantiles using t-digest" by Ted Dunning and Otmar Ertl, https://github.com/tdunning/t-digest/blob/master/docs/t-digest-paper/histo.pdf



Properties

Versions: 0, 0.1
Change log: CHANGELOG.md
Dependencies: base (>=4.7 && <4.11), base-compat (>=0.9.1 && <0.10), binary (>=0.7.1.0 && <0.10), deepseq (>=1.3.0.2 && <1.5), reducers (>=3.12.1 && <3.13), semigroupoids (>=5.1 && <5.3), semigroups (>=0.18.2 && <0.19), vector (>=0.11 && <0.13), vector-algorithms (>=0.7.0.1 && <0.8)
License: BSD3
Author: Oleg Grenrus <oleg.grenrus@iki.fi>
Maintainer: Oleg Grenrus <oleg.grenrus@iki.fi>
Category: Numeric
Home page: https://github.com/futurice/haskell-tdigest#readme
Bug tracker: https://github.com/futurice/haskell-tdigest/issues
Source repository: head: git clone https://github.com/futurice/haskell-tdigest
Uploaded: Wed Mar 8 12:31:14 UTC 2017 by phadej
Updated: Thu Jul 27 22:31:25 UTC 2017 by phadej to revision 2
Distributions: LTSHaskell:0.1, NixOS:0.1, Stackage:0.1, Tumbleweed:0.1
Downloads: 117 total (9 in the last 30 days)
Rating: 0.0 (0 ratings)
Status: Docs available
Last success reported on 2017-03-08


Readme for tdigest-0.1

tdigest

A new data structure for accurate on-line accumulation of rank-based statistics such as quantiles and trimmed means.

See original paper: "Computing extremely accurate quantiles using t-digest" by Ted Dunning and Otmar Ertl

Synopsis

λ *Data.TDigest > median (tdigest [1..1000] :: TDigest 3)
Just 499.0090729817737
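
The same API can be used for other quantiles and for combining digests built from separate chunks of data. The following is a minimal sketch, not taken from the README: it assumes that tdigest-0.1's Data.TDigest exports quantile alongside tdigest and median, and that TDigest has a Monoid instance for merging. The names Digest, chunkA, chunkB, merged, approxMedian and approxP99 are made up for illustration, and the compression value 25 is an arbitrary choice.

```haskell
{-# LANGUAGE DataKinds #-}
-- Sketch only: requires the tdigest package. The use of `quantile` and the
-- Monoid instance is an assumption about the tdigest-0.1 API, not README text.
import Data.Monoid ((<>))
import Data.TDigest (TDigest, median, quantile, tdigest)

-- The type-level natural is the compression parameter (25 is arbitrary;
-- the synopsis above uses 3).
type Digest = TDigest 25

-- Two digests built independently, e.g. from separate chunks of a stream.
chunkA, chunkB :: Digest
chunkA = tdigest [1 .. 500]
chunkB = tdigest [501 .. 1000]

-- Merge the two summaries instead of re-reading the raw data.
merged :: Digest
merged = chunkA <> chunkB

-- Approximate statistics; the result type is Maybe Double, as in the synopsis.
approxMedian, approxP99 :: Maybe Double
approxMedian = median merged
approxP99 = quantile 0.99 merged
```

Loading this file in GHCi and evaluating approxMedian or approxP99 gives estimates for the combined data set; the exact digits depend on the compression parameter, so none are quoted here.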

Benchmarks

Using 50M exponentially distributed numbers:

  • average: 16s; not a valid approximation of the median, included mainly to measure PRNG speed
  • sorting using vector-algorithms: 33s; using 1000MB of memory
  • sparking t-digest (using some par): 53s
  • buffered t-digest: 68s
  • sequential t-digest: 65s
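
For intuition about what the digest approximates, the sorting entry above corresponds to computing rank-based statistics exactly from the full sorted sample. The sketch below is a simplified stand-in that uses only base (Data.List.sort on lists, not vector-algorithms as in the benchmark). The function names exactQuantile and trimmedMean are hypothetical, and linear interpolation between neighbouring order statistics is only one of several common quantile conventions.

```haskell
import Data.List (sort)

-- | Exact q-quantile (0 <= q <= 1) obtained by sorting the whole sample,
-- interpolating linearly between the two nearest order statistics.
-- Returns Nothing for empty input or q outside [0, 1].
exactQuantile :: Double -> [Double] -> Maybe Double
exactQuantile q xs
  | null xs || q < 0 || q > 1 = Nothing
  | otherwise = Just (lo + frac * (hi - lo))
  where
    sorted = sort xs
    n      = length sorted
    pos    = q * fromIntegral (n - 1)      -- fractional 0-based rank
    i      = floor pos :: Int
    frac   = pos - fromIntegral i
    lo     = sorted !! i
    hi     = sorted !! min (n - 1) (i + 1)

-- | Mean after discarding the fraction t (0 <= t < 0.5) of the samples
-- from each end of the sorted data.
trimmedMean :: Double -> [Double] -> Maybe Double
trimmedMean t xs
  | null xs || t < 0 || t >= 0.5 = Nothing
  | otherwise = Just (sum kept / fromIntegral (length kept))
  where
    n    = length xs
    k    = floor (t * fromIntegral n) :: Int  -- samples dropped per side
    kept = take (n - 2 * k) (drop k (sort xs))
```

In GHCi, exactQuantile 0.5 [1..1000] evaluates to Just 500.5 under this convention, which can be compared with the estimate shown in the synopsis. The exact approach has to hold every sample in memory, which is the cost visible in the sorting benchmark entry.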

Example histogram

tdigest-simple -m tdigest -d standard -s 100000 -c 10 -o output.svg -i 34
cp output.svg example.svg
inkscape --export-png=example.png --export-dpi=80 --export-background-opacity=0 --without-gui example.svg

Example