Stability	experimental
Maintainer	Daniël de Kok <me@danieldk.eu>
Safe Haskell	None

Statistics.Test.ApproxRand

Contents

Description
Examples
Data types
Approximate randomization tests
Test statistics

Description

This module provides functionality to perform approximate randomization tests (Noreen, 1989).

Description

Approximate randomization tests rely on a simple premise: given a test statistic, if the null hypothesis (the samples do not differ) is true, we can randomly swap values between samples without an extreme impact on the test statistic. If swapping does have an extreme impact, the null hypothesis must be rejected.

The test works by generating a given number of sample shuffles and computing the test statistic for each shuffle. Let r be the number of shuffled samples for which the test statistic is at least as high as the test statistic of the original samples, and N the number of shuffles. The null hypothesis is then rejected iff (r + 1) / (N + 1) is smaller than the p-value at which the test is performed (for one-sided tests).
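The decision rule above can be sketched in a few lines of plain Haskell. This is only an illustration, not the module's implementation: the names estimateP and rejectNull are hypothetical, the statistics are plain lists rather than the library's Sample type, and only the one-sided case is covered.

```haskell
-- Hypothetical helpers, not part of Statistics.Test.ApproxRand.

-- Given the statistic of the original samples and the statistics of the
-- shuffled samples, compute the one-sided estimate (r + 1) / (N + 1).
estimateP :: Double -> [Double] -> Double
estimateP original shuffled = fromIntegral (r + 1) / fromIntegral (n + 1)
  where
    n = length shuffled                         -- N: number of shuffles
    r = length (filter (>= original) shuffled)  -- r: shuffles at least as high

-- Reject the null hypothesis when the estimate is below the chosen level.
rejectNull :: Double -> Double -> [Double] -> Bool
rejectNull level original shuffled = estimateP original shuffled < level
```

For instance, with an original statistic of 2.0 and shuffled statistics [0.5, 1.0, 2.5, 0.1], one shuffle is at least as high, so the estimate is (1 + 1) / (4 + 1) = 0.4 and the null hypothesis is not rejected at the 0.05 level.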

Two kinds of test are supported:

Paired sample (approxRandPairTest): values from samples are shuffled pair-wise. This requires the samples to have an equal length.
Unpaired sample (approxRandTest): values from samples are shuffled among both samples. Consequently the i-th element of one sample does not bear a relationship with the i-th element of the other sample. The shuffled samples retain the sizes of the original samples.

Both tests can be performed as a one-tailed or two-tailed test.

Examples

Both unpaired and paired sample tests use the Rand monad to obtain random numbers. We can obtain a pseudo-random number generator seeded from the system clock with the newPureMT function (please refer to the documentation of Pure64 for more information):

 prng <- newPureMT

Suppose that we have the samples s1 and s2. We could now perform a two-tailed randomization test with 10,000 shuffles and the mean difference as the test statistic (at the p = 0.01 level), by wrapping these settings in a TestOptions value and running approxRandTest in the Rand monad:

 evalRandom (approxRandTest (TestOptions TwoTailed meanDifference 10000 0.01) s1 s2) prng

It is also possible to obtain the test statistics of the shuffled samples directly (e.g. to inspect the distribution of test statistics) using the approxRandStats and approxRandPairStats functions. These take the test statistic and the number of shuffles, but no p-value:

 evalRandom (approxRandStats meanDifference 10000 s1 s2) prng

Data types

data TestOptions Source

Options for randomization tests

Constructors

TestOptions
Fields

toTestType :: TestType	Type of test (OneTailed or TwoTailed)
toTestStatistic :: TestStatistic	Test statistic
toIterations :: Int	Number of shuffled samples to create
toPValue :: Double	The p-value at which to test (e.g. 0.05)

data TestResult Source

The result of hypothesis testing.

Constructors

TestResult
Fields

trSignificance :: Significance	Significance
trStat :: Double	Test statistic for the samples
trRandomizedStats :: Sample	Test statistics for the randomized samples

Instances

Eq TestResult
Ord TestResult
Show TestResult

data Significance Source

Significance.

Constructors

Significant Double	The null hypothesis should be rejected
NotSignificant Double	Data is compatible with the null hypothesis

Instances

Eq Significance
Ord Significance
Show Significance

type RandWithError a = ErrorT String Rand a Source

Computations with random numbers that can fail.

Approximate randomization tests

approxRandTest Source

Arguments

:: TestOptions	Options for the test
-> Sample	First sample
-> Sample	Second sample
-> Rand TestResult	The test result

Apply an approximate randomization test.

In approximate randomization tests, the values of two samples are shuffled among those samples. A test statistic is calculated for the original samples and the shuffled samples, to detect whether the difference of the samples is extreme or not.
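The unpaired shuffle described above can be illustrated with a small, dependency-free sketch. It is not the module's implementation: shuffleUnpaired is a hypothetical name, lists stand in for the Sample type, and the caller supplies the keys that play the role of random numbers (at least one key per pooled element is assumed).

```haskell
import Data.List (sortOn)

-- Hypothetical helper, not part of Statistics.Test.ApproxRand.
-- Pool both samples, reorder the pool by the supplied keys (standing in for
-- random numbers), and split it so each new sample keeps its original size.
shuffleUnpaired :: [Double] -> [a] -> [a] -> ([a], [a])
shuffleUnpaired keys s1 s2 = splitAt (length s1) reordered
  where
    pooled    = s1 ++ s2
    reordered = map snd (sortOn fst (zip keys pooled))
```

With keys [0.9, 0.1, 0.5, 0.3, 0.7] and the samples [1, 2] and [3, 4, 5], the pool is reordered to [2, 4, 3, 5, 1] and split into ([2, 4], [3, 5, 1]): values move freely between samples, while the sizes 2 and 3 are retained.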

approxRandStats Source

Arguments

:: TestStatistic	Test statistic
-> Int	Number of shuffled samples to create
-> Sample	First sample
-> Sample	Second sample
-> Rand Sample	The statistics of the shuffles

Generate a given number of shuffled samples, and calculate the test statistic for each shuffle.
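As a rough illustration of what generating shuffles and collecting their statistics involves, the sketch below does so with plain lists. All names (nextSeed, shuffleOnce, shuffledStats) are hypothetical, and the toy linear congruential generator is only a stand-in for the Mersenne Twister generator the module uses; it assumes a 64-bit Int and is not suitable for real statistical work.

```haskell
import Data.List (sortOn)

-- Toy linear congruential generator (assumes 64-bit Int); a stand-in for a
-- real generator, chosen only to keep the sketch dependency-free.
nextSeed :: Int -> Int
nextSeed s = (s * 1103515245 + 12345) `mod` 2147483648

-- One shuffle: pool both samples, reorder by pseudo-random keys, and split
-- at the size of the first sample. Also returns the seed for the next shuffle.
shuffleOnce :: Int -> [Double] -> [Double] -> (([Double], [Double]), Int)
shuffleOnce seed s1 s2 = (splitAt (length s1) reordered, seed')
  where
    pooled    = s1 ++ s2
    keys      = take (length pooled) (drop 1 (iterate nextSeed seed))
    seed'     = last (seed : keys)
    reordered = map snd (sortOn fst (zip keys pooled))

-- Create n shuffles and apply the test statistic to each of them.
shuffledStats :: ([Double] -> [Double] -> Double)
              -> Int -> Int -> [Double] -> [Double] -> [Double]
shuffledStats stat n seed s1 s2 = go n seed
  where
    go k s
      | k <= 0    = []
      | otherwise =
          let ((a, b), s') = shuffleOnce s s1 s2
          in  stat a b : go (k - 1) s'
```

The resulting list plays the role of the statistics of the shuffles: it has one entry per shuffle, and each shuffled pair of samples keeps the sizes of the original samples.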

This function does not require the samples to have an equal length.

approxRandPairTest Source

Arguments

:: TestOptions	Options for the test
-> Sample	First sample
-> Sample	Second sample
-> RandWithError TestResult	The test result

Apply a pair-wise approximate randomization test.

In pair-wise approximate randomization tests the data points at a given index are swapped between samples with a probability of 0.5. Since swapping is pairwise, the samples should have the same length.
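The pair-wise swap can be sketched as follows. This is an illustration under stated assumptions rather than the module's code: swapPairs is a hypothetical name, lists replace the Sample type, and the Boolean flags supplied by the caller stand in for fair coin flips. Returning Nothing for samples of unequal length is a design choice of this sketch, loosely echoing the fact that the paired test runs in RandWithError.

```haskell
-- Hypothetical helper, not part of Statistics.Test.ApproxRand.
-- Swap the data points at each index where the corresponding flag is True.
-- The flags stand in for fair coin flips (probability 0.5 each).
swapPairs :: [Bool] -> [a] -> [a] -> Maybe ([a], [a])
swapPairs flips s1 s2
  | length s1 /= length s2   = Nothing  -- pairing is undefined for unequal lengths
  | length flips < length s1 = Nothing  -- not enough coin flips supplied
  | otherwise                = Just (unzip (zipWith3 pick flips s1 s2))
  where
    pick swap x y = if swap then (y, x) else (x, y)
```

For example, the flags [True, False, True] applied to [1, 2, 3] and [10, 20, 30] give Just ([10, 2, 30], [1, 20, 3]): the first and third pairs are swapped, the second is left in place.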

approxRandPairStats Source

Arguments

:: TestStatistic	Test statistic
-> Int	Number of shuffled samples to create
-> Sample	First sample
-> Sample	Second sample
-> RandWithError Sample	The statistics of the shuffles

Generate a given number of pairwise shuffled samples, and calculate the test statistic for each shuffle.

Since the data points at a given index are swapped (with a probability of 0.5), the samples should have the same length.

Test statistics

type TestStatistic = Sample -> Sample -> Double Source

A test statistic calculates the difference between two samples. See meanDifference and varianceRatio for examples.

differenceMean :: TestStatistic Source

Calculates the difference mean of two samples (mean(s1 - s2)). When the two samples do not have an equal length, the trailing elements of the longer vector are ignored.

meanDifference :: TestStatistic Source

Calculates the mean difference of two samples (mean(s1) - mean(s2)).

varianceRatio :: TestStatistic Source

Calculates the ratio of sample variances (var(s1) : var(s2)).
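To make the difference between the three statistics concrete, here is a list-based sketch. The primed names are hypothetical and are not the library's functions, which operate on the Sample type. The variance uses an (n - 1) divisor, which is an assumption; the library may use a different estimator.

```haskell
-- Hypothetical list-based versions of the statistics described above.
mean :: [Double] -> Double
mean xs = sum xs / fromIntegral (length xs)

-- mean(s1 - s2): element-wise differences; zipWith drops the trailing
-- elements of the longer list.
differenceMean' :: [Double] -> [Double] -> Double
differenceMean' s1 s2 = mean (zipWith (-) s1 s2)

-- mean(s1) - mean(s2): every element of both samples is used.
meanDifference' :: [Double] -> [Double] -> Double
meanDifference' s1 s2 = mean s1 - mean s2

-- Assumption: variance with an (n - 1) divisor.
variance' :: [Double] -> Double
variance' xs = sum [(x - m) ^ (2 :: Int) | x <- xs] / fromIntegral (length xs - 1)
  where
    m = mean xs

-- var(s1) : var(s2)
varianceRatio' :: [Double] -> [Double] -> Double
varianceRatio' s1 s2 = variance' s1 / variance' s2
```

For samples of equal length the first two statistics agree: both give -2.5 for [1, 2, 3, 4] and [2, 4, 6, 8]. They differ once the lengths differ: for [1, 2, 3] and [1], differenceMean' only sees the first pair and yields 0, whereas meanDifference' yields 2 - 1 = 1.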