hyperloglog-0.3.1: An approximate streaming (constant space) unique object counter

Copyright(c) Edward Kmett 2013-2015
LicenseBSD3
MaintainerEdward Kmett <ekmett@gmail.com>
Stabilityexperimental
Portabilitynon-portable
Safe HaskellNone
LanguageHaskell98

Data.HyperLogLog.Type

Contents

Description

This package provides an approximate streaming (constant space) unique object counter.

See the original paper for details: http://algo.inria.fr/flajolet/Publications/FlFuGaMe07.pdf

Synopsis

HyperLogLog

newtype HyperLogLog p Source

Initialize a new counter:

>>> mempty :: HyperLogLog $(3)
HyperLogLog {runHyperLogLog = fromList [0,0,0,0,0,0,0,0]}

Please note how you specify a counter size with the $(n) invocation. Sizes of up to 16 are valid, with 7 being a likely good minimum for decent accuracy.

Let's count a list of unique items and get the latest estimate:

>>> size (foldr insert mempty [1..10] :: HyperLogLog $(4))
Approximate {_confidence = 0.9972, _lo = 2, _estimate = 9, _hi = 17}

Note how insert can be used to add new observations to the approximate counter.

Constructors

HyperLogLog 

class HasHyperLogLog a p | a -> p where Source

Instances

size :: ReifiesConfig p => HyperLogLog p -> Approximate Int64 Source

Approximate size of our set

insertHash :: ReifiesConfig s => Word32 -> HyperLogLog s -> HyperLogLog s Source

Insert a value that has already been hashed by whatever user defined hash function you want.