Portability | portable |
---|---|
Stability | experimental |
Maintainer | bos@serpentine.com |

Functions for performing non-parametric tests (i.e. tests without an assumption of underlying distribution).

- mannWhitneyU :: Sample -> Sample -> (Double, Double)
- mannWhitneyUCriticalValue :: (Int, Int) -> Double -> Maybe Int
- mannWhitneyUSignificant :: Bool -> (Int, Int) -> Double -> (Double, Double) -> Maybe Bool
- wilcoxonMatchedPairSignedRank :: Sample -> Sample -> (Double, Double)
- wilcoxonMatchedPairSignificant :: Bool -> Int -> Double -> (Double, Double) -> Maybe Bool
- wilcoxonMatchedPairSignificance :: Int -> Double -> Double
- wilcoxonMatchedPairCriticalValue :: Int -> Double -> Maybe Int
- wilcoxonRankSums :: Sample -> Sample -> (Double, Double)

# Mann-Whitney U test (non-parametric equivalent to the independent t-test)

mannWhitneyU :: Sample -> Sample -> (Double, Double)

The Mann-Whitney U Test.

This is sometimes known as the Mann-Whitney-Wilcoxon U test and, confusingly,
many sources state that the Mann-Whitney U test is the same as
Wilcoxon's rank sum test (which is provided as `wilcoxonRankSums`).
More precisely, the Mann-Whitney U statistic is a simple transform of Wilcoxon's rank sum statistic.

Again confusingly, different sources state reversed definitions for U_1 and U_2,
so it is worth being explicit about what this function returns. Given two samples,
the first, xs_1, of size n_1 and the second, xs_2, of size n_2, this function
returns (U_1, U_2) where U_1 = W_1 - (n_1*(n_1+1))/2 and U_2 = W_2 - (n_2*(n_2+1))/2,
and (W_1, W_2) is the return value of `wilcoxonRankSums xs_1 xs_2`.

Some sources instead state that U_1 and U_2 should be the other way round, often expressing this using U_1' = n_1*n_2 - U_1 (since U_1 + U_2 = n_1*n_2).

None of this is likely to matter if you simply pass the result to `mannWhitneyUSignificant`.
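The relationship between the rank sums and the U values can be sketched as follows. This is an illustration only: `rankSumsToU` is a hypothetical helper, not a function exported by this module, and it takes the rank sums as plain arguments rather than calling `wilcoxonRankSums`.

```haskell
-- Hypothetical helper (not part of this module): converts the rank sums
-- (W_1, W_2) into (U_1, U_2) using the definitions given above.
rankSumsToU :: (Int, Int) -> (Double, Double) -> (Double, Double)
rankSumsToU (n1, n2) (w1, w2) = (u1, u2)
  where
    -- U_i = W_i - n_i * (n_i + 1) / 2
    u1 = w1 - fromIntegral (n1 * (n1 + 1)) / 2
    u2 = w2 - fromIntegral (n2 * (n2 + 1)) / 2
```

For example, the samples [3, 5, 6] and [1, 2, 4] have rank sums (14, 7), so `rankSumsToU (3, 3) (14, 7)` gives (8, 1), and 8 + 1 = 3*3 as expected from U_1 + U_2 = n_1*n_2.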

mannWhitneyUCriticalValue

:: (Int, Int) | The sample size |

-> Double | The p-value (e.g. 0.05) for which you want the critical value. |

-> Maybe Int | The critical value (of U). |

Calculates the critical value of Mann-Whitney U for the given sample sizes and significance level.

This function returns the exact calculated value of U for all sample sizes; it does not use the normal approximation at all. Above sample size 20 it is generally recommended to use the normal approximation instead, but this function will calculate the higher critical values if you need them.

The algorithm to generate these values is a faster, memoised version of the simple unoptimised generating function given in section 2 of "The Mann Whitney Wilcoxon Distribution Using Linked Lists", Cheung and Klotz, Statistica Sinica 7 (1997), http://www3.stat.sinica.edu.tw/statistica/oldpdf/A7n316.pdf.
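As a rough illustration of what an exact calculation involves, the sketch below uses the classical counting recurrence for the null distribution of U, assuming no ties. It is deliberately unoptimised and is not the memoised implementation used by this module; `countU` and `cumulativeU` are hypothetical names, and how a cumulative probability is mapped to the returned critical value (for instance, one-tailed versus two-tailed) is not covered here.

```haskell
-- Number of orderings of n1 + n2 distinct values for which U equals u.
-- Consider the largest value: if it belongs to the first sample it
-- contributes n2 to U, otherwise it contributes nothing.
countU :: Int -> Int -> Int -> Integer
countU n1 n2 u
  | u < 0              = 0
  | n1 == 0 || n2 == 0 = if u == 0 then 1 else 0
  | otherwise          = countU (n1 - 1) n2 (u - n2) + countU n1 (n2 - 1) u

-- P(U <= u) under the null hypothesis, assuming no ties.
cumulativeU :: Int -> Int -> Int -> Double
cumulativeU n1 n2 u = fromIntegral (sum (counts [0 .. u])) / fromIntegral total
  where
    counts = map (countU n1 n2)
    total  = sum (counts [0 .. n1 * n2])
```

With two samples of size 3 there are 20 equally likely orderings and only one gives U = 0, so `cumulativeU 3 3 0` is 1/20.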

mannWhitneyUSignificant

:: Bool | Perform one-tailed test (see description below). |

-> (Int, Int) | The sample size from which the (U_1,U_2) values were derived. |

-> Double | The p-value at which to test (e.g. 0.05) |

-> (Double, Double) | The (U_1, U_2) values from `mannWhitneyU`. |

-> Maybe Bool | Just True if the test is significant, Just False if it is not, and Nothing if the sample was too small to make a decision. |

Calculates whether the Mann-Whitney U test is significant.

If both sample sizes are less than or equal to 20, the exact U critical value
(as calculated by `mannWhitneyUCriticalValue`) is used. If either sample is
larger than 20, the normal approximation is used instead.

If you use a one-tailed test, the test indicates whether the first sample is significantly larger than the second. If you want the opposite, simply reverse the order in both the sample size and the (U_1, U_2) pairs.
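For reference, the textbook form of the normal approximation to the distribution of U under the null hypothesis is given below. Whether this module applies a tie correction or a continuity correction is not stated here, so treat this as the standard formula rather than a description of the implementation.

```latex
\mu_U = \frac{n_1 n_2}{2}, \qquad
\sigma_U = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}, \qquad
z = \frac{U - \mu_U}{\sigma_U}
```

The statistic z is then compared against the standard normal distribution at the chosen significance level.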

# Wilcoxon signed-rank matched-pair test (non-parametric equivalent to the paired t-test)

wilcoxonMatchedPairSignedRank :: Sample -> Sample -> (Double, Double)

The Wilcoxon matched-pairs signed-rank test.

The value returned is the pair (T+, T-). T+ is the sum of positive ranks (the
ranks of the differences where the first parameter is higher) whereas T- is
the sum of negative ranks (the ranks of the differences where the second parameter is higher).
These values mean little by themselves, and should be combined with the `wilcoxonMatchedPairSignificant`
function in this module to get a meaningful result.

The samples are zipped together: if one is longer than the other, both are truncated to the length of the shorter sample.

Note that swapping the arguments swaps the components of the result: `wilcoxonMatchedPairSignedRank xs ys == swap (wilcoxonMatchedPairSignedRank ys xs)`, where `swap (x, y) = (y, x)`.
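The calculation of (T+, T-) can be sketched with plain lists. This is an illustration under stated assumptions, not this module's implementation: `signedRankSums` is a hypothetical name, it works on `[Double]` rather than `Sample`, it assumes zero differences are discarded before ranking, and it assumes tied absolute differences share their average rank.

```haskell
import Data.Function (on)
import Data.List (groupBy, sortBy)
import Data.Ord (comparing)

-- Hypothetical list-based sketch of the (T+, T-) calculation.
signedRankSums :: [Double] -> [Double] -> (Double, Double)
signedRankSums xs ys = (sumRanks (> 0), sumRanks (< 0))
  where
    -- zipWith truncates to the shorter sample; zero differences are
    -- dropped (an assumption of this sketch).
    diffs  = filter (/= 0) (zipWith (-) xs ys)
    sorted = sortBy (comparing abs) diffs
    -- Pair each difference with its 1-based position, then group ties.
    groups = groupBy ((==) `on` (abs . snd)) (zip [1 ..] sorted)
    ranked = concatMap average groups
    -- Every member of a tied group receives the mean of the group's positions.
    average g =
      let r = sum (map fst g) / fromIntegral (length g)
      in [(r, d) | (_, d) <- g]
    sumRanks p = sum [r | (r, d) <- ranked, p d]
```

For the samples [10, 12, 15, 9] and [8, 13, 11, 9] the differences are 2, -1, 4 and 0. After dropping the zero, the ranks by absolute value are 1 (for -1), 2 and 3, so the result is (5, 1). Swapping the two samples gives (1, 5).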

wilcoxonMatchedPairSignificant

:: Bool | Perform one-tailed test (see description below). |

-> Int | The sample size from which the (T+,T-) values were derived. |

-> Double | The p-value at which to test (e.g. 0.05) |

-> (Double, Double) | The (T+, T-) values from `wilcoxonMatchedPairSignedRank`. |

-> Maybe Bool | Just True if the test is significant, Just False if it is not, and Nothing if the sample was too small to make a decision. |

Tests whether a given result from a Wilcoxon signed-rank matched-pairs test is significant at the given level.

This function can perform a one-tailed or two-tailed test. If the first
parameter to this function is False, the test is performed two-tailed to
check if the two samples differ significantly. If the first parameter is
True, the check is performed one-tailed to decide whether the first sample
(i.e. the first sample you passed to `wilcoxonMatchedPairSignedRank`) is
greater than the second sample (i.e. the second sample you passed to
`wilcoxonMatchedPairSignedRank`). If you wish to perform a one-tailed test
in the opposite direction, you can either pass the parameters in a different
order to `wilcoxonMatchedPairSignedRank`, or simply swap the values in the resulting
pair before passing them to this function.

wilcoxonMatchedPairSignificance

:: Int | The sample size |

-> Double | The value of T for which you want the significance. |

-> Double | The significance (p-value). |

Works out the significance level (p-value) of a T value from the Wilcoxon signed-rank matched-pairs test, given the sample size.

See the notes on `wilcoxonMatchedPairCriticalValue` for how this is calculated.

wilcoxonMatchedPairCriticalValue

:: Int | The sample size |

-> Double | The p-value (e.g. 0.05) for which you want the critical value. |

-> Maybe Int | The critical value (of T), or Nothing if the sample is too small to make a decision. |

Obtains the critical value of T to compare against, given a sample size and a p-value (significance level). Your T value must be less than or equal to the return of this function in order for the test to work out significant. If there is a Nothing return, the sample size is too small to make a decision.

`wilcoxonMatchedPairSignificant` tests the return value of `wilcoxonMatchedPairSignedRank`
for you, so you should use `wilcoxonMatchedPairSignificant` for determining test results.
However, this function is useful, for example, for generating lookup tables
for Wilcoxon signed rank critical values.

The return values of this function are generated using the method detailed in the paper "Critical Values for the Wilcoxon Signed Rank Statistic", Peter Mitic, The Mathematica Journal, volume 6, issue 3, 1996, which can be found here: http://www.mathematica-journal.com/issue/v6i3/article/mitic/contents/63mitic.pdf. According to that paper, the results may differ from other published lookup tables, but (Mitic claims) the values obtained by this function will be the correct ones.
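To make the idea of an exact critical value concrete, the sketch below counts sign assignments using the coefficients of the product (1 + x)(1 + x^2)...(1 + x^n), a standard way to obtain the exact null distribution of T when there are no ties or zero differences. It is not claimed to reproduce this module's implementation or every detail of the cited paper: `rankSumCounts`, `signedRankPValue` and `criticalT` are hypothetical names, and the one-tailed convention used here is an assumption.

```haskell
-- Coefficients of prod_{k=1..n} (1 + x^k): entry t is the number of sign
-- assignments whose positive-rank sum equals t.
rankSumCounts :: Int -> [Integer]
rankSumCounts n = foldl step [1] [1 .. n]
  where
    -- Multiplying by (1 + x^k) adds a copy of the coefficients shifted by k.
    step cs k = zipWith (+) (cs ++ replicate k 0) (replicate k 0 ++ cs)

-- One-tailed P(T <= t) for n untied, non-zero differences.
signedRankPValue :: Int -> Int -> Double
signedRankPValue n t =
  fromIntegral (sum (take (t + 1) (rankSumCounts n))) / 2 ^ n

-- Largest t with P(T <= t) <= p, or Nothing if even t = 0 is not rare enough.
criticalT :: Int -> Double -> Maybe Int
criticalT n p =
  case takeWhile (\t -> signedRankPValue n t <= p) [0 .. n * (n + 1) `div` 2] of
    [] -> Nothing
    ts -> Just (last ts)
```

Under these assumptions, `criticalT 5 0.05` is `Just 0` because P(T <= 0) = 1/32 while P(T <= 1) = 2/32, and `criticalT 3 0.05` is `Nothing` because the smallest attainable probability with three pairs is 1/8.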

# Wilcoxon rank sum test

wilcoxonRankSums :: Sample -> Sample -> (Double, Double)

The Wilcoxon Rank Sums Test.

This test calculates the sum of ranks for the given two samples. The samples are ordered, and assigned ranks (ties are given their average rank), then these ranks are summed for each sample.

The return value is (W_1, W_2) where W_1 is the sum of ranks of the first sample
and W_2 is the sum of ranks of the second sample. This test is trivially transformed
into the Mann-Whitney U test. You will probably want to use `mannWhitneyU`
and the related functions for testing significance, but this function is exposed
for completeness.
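The ranking procedure described above can be sketched with plain lists. This is an illustration only: `rankSums` is a hypothetical name, it works on `[Double]` rather than `Sample`, and it is not this module's implementation.

```haskell
import Data.Function (on)
import Data.List (groupBy, sortBy)
import Data.Ord (comparing)

-- Hypothetical list-based sketch of the rank-sum calculation.
rankSums :: [Double] -> [Double] -> (Double, Double)
rankSums xs ys = (sumFor True, sumFor False)
  where
    -- Tag each value with the sample it came from, then sort the pooled data.
    tagged = sortBy (comparing snd) ([(True, x) | x <- xs] ++ [(False, y) | y <- ys])
    -- Pair each value with its 1-based position and group equal values.
    groups = groupBy ((==) `on` (snd . snd)) (zip [1 ..] tagged)
    ranked = concatMap average groups
    -- Tied values all receive the mean of the positions they occupy.
    average g =
      let r = sum (map fst g) / fromIntegral (length g)
      in [(r, isFirst) | (_, (isFirst, _)) <- g]
    sumFor b = sum [r | (r, isFirst) <- ranked, isFirst == b]
```

For example, `rankSums [3, 5, 6] [1, 2, 4]` gives (14, 7). With a tie, `rankSums [1, 2] [2, 3]` gives (3.5, 6.5), because the two 2s share the average of ranks 2 and 3.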