aivika-experiment-2.1: Simulation experiments for the Aivika library

Stability: experimental. Maintainer: David Sorokin. Safe Haskell: Safe-Inferred.

Simulation.Aivika.Experiment.Histogram

Description

Tested with: GHC 7.6.3

This module computes a histogram from the specified data, using the given strategy to choose the bins.

The code in this module is essentially based on the http://hackage.haskell.org/package/Histogram package by Mike Izbicki, who kindly agreed to re-license his library under BSD3, which allowed me to use his code and comments with some modifications.


# Creating Histograms

type Histogram = [(Double, [Int])]

Holds all the information needed to plot the histogram for a list of different series. Each series produces its own item in the resulting `[Int]` list, which may contain zeros.
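To make the shape of this type concrete, here is a small hand-written value with three bins and two series. It is an illustration only, not output of the library: the alias is copied locally so the sketch stands alone, `exampleHistogram` and `seriesCounts` are hypothetical names, and the sketch assumes the `Double` identifies the bin (for instance its center), which the description above does not spell out.

```haskell
-- Local copy of the documented type alias, so the sketch compiles on its own.
type Histogram = [(Double, [Int])]

-- A hand-written value (not library output): three bins, two series.
-- Each [Int] holds one count per series; zeros mean "no samples in this bin".
exampleHistogram :: Histogram
exampleHistogram =
  [ (0.5, [2, 0])
  , (1.5, [1, 3])
  , (2.5, [0, 1])
  ]

-- Hypothetical helper: the counts of the i-th series (0-based), paired with
-- the Double of each bin. Assumes i is a valid series index.
seriesCounts :: Int -> Histogram -> [(Double, Int)]
seriesCounts i h = [ (x, counts !! i) | (x, counts) <- h ]
```

With this value, `seriesCounts 1 exampleHistogram` picks out the second series across all bins.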

histogram :: BinningStrategy -> [[Double]] -> Histogram

Creates a histogram by specifying the list of series. Call it with one of the binning strategies that is appropriate to the type of data you have. If you don't know, then try using `binSturges`.

histogramBinSize :: Double -> [[Double]] -> Histogram

Creates a histogram with exactly the specified bin size. You probably don't want to use this function; prefer `histogram` with an appropriate binning strategy.
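As a rough illustration of what binning by a fixed size involves, the following sketch counts the values of a single series per bin. It is not claimed to be the library's implementation: `countByBinSize` is a hypothetical name, and the sketch assumes that bins are aligned to multiples of the bin size and identified by their centers.

```haskell
import Data.List (group, sort)

-- Hypothetical helper: count how many values fall into each bin of width
-- `size`. Every value is first replaced by the center of its bin, then equal
-- centers are grouped and counted. Assumes size > 0.
countByBinSize :: Double -> [Double] -> [(Double, Int)]
countByBinSize size xs =
  [ (c, length g) | g@(c:_) <- group (sort (map center xs)) ]
  where
    -- Bin k covers [k * size, (k + 1) * size); its center is k * size + size / 2.
    center x = size * fromIntegral (floor (x / size) :: Integer) + size / 2
```

Bins that receive no values do not appear in the result of this sketch, which is one reason a multi-series histogram needs a separate step to fill in zeros.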

histogramNumBins :: Int -> [[Double]] -> Histogram

Creates a histogram with approximately the specified number of bins. You probably don't want to use this function; prefer `histogram` with an appropriate binning strategy.
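One plausible way to relate a requested number of bins to a bin size is to divide the overall data range by that number. The sketch below shows this under that assumption; `binSizeForSketch` is a hypothetical name and the library may derive the size differently. It also hints at why the number of bins is only approximate: once bins are aligned to fixed boundaries, the data may spill into one more bin than requested.

```haskell
-- Hypothetical helper: derive a bin size from a requested number of bins by
-- dividing the range of all series taken together. Returns Nothing when the
-- request is not meaningful (no data, or a non-positive number of bins).
binSizeForSketch :: Int -> [[Double]] -> Maybe Double
binSizeForSketch k xss
  | k <= 0 || null xs = Nothing
  | otherwise         = Just ((maximum xs - minimum xs) / fromIntegral k)
  where
    xs = concat xss
```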

# Binning Strategies

type BinningStrategy = [Double] -> Int

The strategy applied to calculate the histogram bins.

Sturges' binning strategy requires the least computational work, but it is recommended only for normally distributed data.
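For orientation, the textbook form of Sturges' rule chooses ceiling(log2 n) + 1 bins for n samples, so it only needs the sample count. The sketch below expresses that as a function of the documented strategy type. `sturgesSketch` is a hypothetical name, the alias is copied locally so the code stands alone, and the result is not claimed to match the library's `binSturges` exactly.

```haskell
-- Local copy of the documented alias so the sketch compiles on its own.
type BinningStrategy = [Double] -> Int

-- Textbook Sturges' rule: ceiling(log2 n) + 1 bins for n samples.
-- Only the number of samples matters, which is why the rule is cheap.
-- An empty series falls back to a single bin (an assumption of this sketch).
sturgesSketch :: BinningStrategy
sturgesSketch xs
  | null xs   = 1
  | otherwise = ceiling (logBase 2 n) + 1
  where
    n = fromIntegral (length xs) :: Double
```

Under this formula, 100 samples give 8 bins and 10 samples give 5.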

Doane's binning strategy extends Sturges' for non-normal data. It takes a little more time because it must calculate the kurtosis (peakedness) of the distribution.

Using the square root of the number of samples as the number of bins is not supported by any theory, but it is commonly used by Excel and other histogram-making software.
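A minimal sketch of the square-root choice follows. `sqrtBinsSketch` is a hypothetical name, and both rounding up and the lower bound of one bin are assumptions of this sketch rather than documented behaviour of the library.

```haskell
-- Square-root choice: about sqrt(n) bins for n samples.
-- Rounding up and the minimum of one bin are assumptions of this sketch.
sqrtBinsSketch :: [Double] -> Int
sqrtBinsSketch xs = max 1 (ceiling (sqrt n))
  where
    n = fromIntegral (length xs) :: Double
```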

Scott's rule is optimal for normally distributed data, but it requires more computation than Sturges'.
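The extra computation comes from the sample standard deviation. In its usual textbook form, Scott's rule sets the bin width to about 3.49 * s * n^(-1/3), where s is the sample standard deviation and n the number of samples, and the number of bins follows from dividing the data range by that width. The sketch below follows this form; `scottBinsSketch` is a hypothetical name, the constant 3.49 is the commonly quoted approximation, and the fallback to a single bin for degenerate input is an assumption of the sketch, not documented library behaviour.

```haskell
-- Textbook Scott's rule: bin width h = 3.49 * s * n^(-1/3), where s is the
-- sample standard deviation; the number of bins is ceiling(range / h).
-- Falls back to one bin when there are fewer than two samples or when all
-- samples are equal (h would be zero).
scottBinsSketch :: [Double] -> Int
scottBinsSketch xs
  | n < 2 || h <= 0 = 1
  | otherwise       = ceiling ((maximum xs - minimum xs) / h)
  where
    n        = fromIntegral (length xs) :: Double
    mean     = sum xs / n
    -- Unbiased sample variance (divides by n - 1).
    variance = sum [ (x - mean) * (x - mean) | x <- xs ] / (n - 1)
    h        = 3.49 * sqrt variance * n ** (-1 / 3)
```

Compared with the Sturges sketch, this one has to traverse the data several times (sum, variance, minimum, maximum) instead of only counting it.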