Statistics.Internal  [portable; experimental; maintainer: bos@serpentine.com]

inlinePerformIO: Just like unsafePerformIO, but we inline it. Big performance gains, as it exposes lots of things to further inlining. /Very unsafe/. In particular, you should do no memory allocation inside an inlinePerformIO block. On Hugs this is just unsafePerformIO.

Statistics.Function  [portable; experimental; maintainer: bos@serpentine.com]

sort: Sort a vector.
partialSort: Partially sort a vector, such that the least k elements will be at the front. Argument: the number k of least elements.
indices: Return the indices of a vector.
indexed: Zip a vector with its indices.
minMax: Compute the minimum and maximum of a vector in one pass.
create: Create a vector, using the given action to populate each element.

Statistics.Types  [portable; experimental; maintainer: bos@serpentine.com]

Weights: Weights for affecting the importance of elements of a sample.
Estimator: A function that estimates a property of a sample, such as its mean.
WeightedSample: Sample with weights. The first element of each pair is the datum, the second is its weight.
Sample: Sample data.

Statistics.Resampling  [portable; experimental; maintainer: bos@serpentine.com]

Resample: A resample drawn randomly, with replacement, from a set of data points. Distinct from a normal array to make it harder for your humble author's brain to go wrong.
resample: Resample a data set repeatedly, with replacement, computing each estimate over the resampled data.
jackknife: Compute a statistical estimate repeatedly over a sample, each time omitting a successive element.
dropAt: Drop the kth element of a vector.

Statistics.Distribution  [portable; experimental; maintainer: bos@serpentine.com]

Variance: Type class for distributions with variance.
Mean: Type class for distributions with mean.
ContDistr: Continuous probability distribution.
density: Probability density function. The probability that a random variable X lies in the infinitesimal interval [x, x+dx) is density(x) * dx.
quantile: Inverse of the cumulative distribution function. The value x for which P(X <= x) = p.
DiscreteDistr: Discrete probability distribution.
probability: Probability of the nth outcome.
Distribution: Type class common to all distributions. Only the c.d.f. need be defined, for both discrete and continuous distributions.
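The leave-one-out scheme behind jackknife can be sketched on plain lists: for each index k, drop the kth element and apply the estimator to what remains. This is an illustrative re-implementation for lists, not the library's vector-based code; the helper listMean is my own.

```haskell
-- Drop the element at index k (the dropAt operation, sketched on lists).
dropAt :: Int -> [a] -> [a]
dropAt k xs = take k xs ++ drop (k + 1) xs

-- Apply an estimator to every leave-one-out subsample.
jackknife :: ([Double] -> Double) -> [Double] -> [Double]
jackknife est xs = [ est (dropAt k xs) | k <- [0 .. length xs - 1] ]

-- A simple estimator for demonstration (assumed, not from the library).
listMean :: [Double] -> Double
listMean ys = sum ys / fromIntegral (length ys)
```

For example, jackknife listMean [1,2,3,4] computes the mean of each of the four three-element subsamples.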
cumulative: Cumulative distribution function. The probability that a random variable X is less than or equal to x, i.e. P(X <= x).
findRoot: Approximate the value of X for which P(x > X) = p. This method uses a combination of Newton-Raphson iteration and bisection, with the given guess as a starting point. The upper and lower bounds specify the interval in which the probability distribution reaches the value p. Arguments: the distribution; the probability p; an initial guess; a lower bound on the interval; an upper bound on the interval.
sumProbabilities: Sum probabilities in an inclusive interval.

Statistics.Distribution.Geometric  [portable; experimental; maintainer: bos@serpentine.com]

geometric: Create a geometric distribution. Argument: the success rate.

Statistics.Constants  [portable; experimental; maintainer: bos@serpentine.com]

m_huge: A very large number.
m_max_exp: The largest x such that 2**(x-1) is approximately representable as a Double.
m_sqrt_2: sqrt 2
m_sqrt_2_pi: sqrt (2 * pi)
m_2_sqrt_pi: 2 / sqrt pi
m_1_sqrt_2: 1 / sqrt 2
m_epsilon: The smallest Double x such that 1 + x /= 1.
m_ln_sqrt_2_pi: log (sqrt (2 * pi))
m_pos_inf: Positive infinity.
m_neg_inf: Negative infinity.
m_NaN: Not a number.

Statistics.Quantile  [portable; experimental; maintainer: bos@serpentine.com]

ContParam: Parameters a and b to the continuousBy function.
weightedAvg: O(n log n). Estimate the kth q-quantile of a sample, using the weighted average method. Arguments: k, the desired quantile; q, the number of quantiles; x, the sample data.
continuousBy: O(n log n). Estimate the kth q-quantile of a sample x, using the continuous sample method with the given parameters. This is the method used by most statistical software, such as R, Mathematica, SPSS, and S. Arguments: the parameters a and b; k, the desired quantile; q, the number of quantiles; x, the sample data.
midspread: O(n log n). Estimate the range between q-quantiles 1 and q-1 of a sample x, using the continuous sample method with the given parameters. For instance, the interquartile range (IQR) can be estimated as follows:

  midspread medianUnbiased 4 (U.fromList [1,1,2,2,3])
  ==> 1.333333

Arguments: the parameters a and b; q, the number of quantiles; x, the sample data.
cadpw: California Department of Public Works definition, a=0, b=1. Gives a linear interpolation of the empirical CDF. This corresponds to method 4 in R and Mathematica.
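The inversion performed by findRoot combines Newton-Raphson iteration with bisection; a bisection-only sketch conveys the idea, under the assumption that the c.d.f. is monotone on the bracketing interval (this is an illustration, not the library's implementation):

```haskell
-- Invert a monotone c.d.f. by bisection: find x with cdf x ~ p,
-- given lower and upper bounds that bracket the solution.
invertCDF :: (Double -> Double) -> Double -> Double -> Double -> Double
invertCDF cdf p lo hi
  | hi - lo < 1e-12 = mid                      -- interval is tight enough
  | cdf mid < p     = invertCDF cdf p mid hi   -- solution lies above mid
  | otherwise       = invertCDF cdf p lo mid   -- solution lies at or below mid
  where mid = (lo + hi) / 2
```

Inverting the uniform c.d.f. (the identity on [0,1]) at p = 0.25 recovers a value close to 0.25.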
hazen: Hazen's definition, a=0.5, b=0.5. This is claimed to be popular among hydrologists. This corresponds to method 5 in R and Mathematica.
spss: Definition used by the SPSS statistics application, with a=0, b=0 (also known as Weibull's definition). This corresponds to method 6 in R and Mathematica.
s: Definition used by the S statistics application, with a=1, b=1. The interpolation points divide the sample range into n-1 intervals. This corresponds to method 7 in R and Mathematica.
medianUnbiased: Median unbiased definition, a=1/3, b=1/3. The resulting quantile estimates are approximately median unbiased regardless of the distribution of x. This corresponds to method 8 in R and Mathematica.
normalUnbiased: Normal unbiased definition, a=3/8, b=3/8. An approximately unbiased estimate if the empirical distribution approximates the normal distribution. This corresponds to method 9 in R and Mathematica.

Statistics.Sample  [portable; experimental; maintainer: bos@serpentine.com]

mean: Arithmetic mean. This uses Welford's algorithm to provide numerical stability, using a single pass over the sample data.
meanWeighted: Arithmetic mean for a weighted sample. It uses an algorithm analogous to the one in mean.
harmonicMean: Harmonic mean. This algorithm performs a single pass over the sample.
geometricMean: Geometric mean of a sample containing no negative values.
centralMoment: Compute the kth central moment of a sample. The central moment is also known as the moment about the mean. This function performs two passes over the sample, so is not subject to stream fusion. For samples containing many values very close to the mean, this function is subject to inaccuracy due to catastrophic cancellation.
centralMoments: Compute the kth and jth central moments of a sample. This function performs two passes over the sample, so is not subject to stream fusion. For samples containing many values very close to the mean, this function is subject to inaccuracy due to catastrophic cancellation.
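The two-pass scheme behind centralMoment — first compute the mean, then average the kth powers of the deviations — can be sketched on lists (an illustration, not the library's fused vector code):

```haskell
-- Two-pass kth central moment: the average of (x - mean)^k.
-- Pass one computes the mean; pass two averages the powered deviations.
centralMoment :: Int -> [Double] -> Double
centralMoment k xs = sum [ (x - m) ^ k | x <- xs ] / n
  where
    n = fromIntegral (length xs)
    m = sum xs / n
```

The second central moment of [1,2,3,4,5] is 2, the population variance; the first central moment of any sample is zero up to rounding.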
skewness: Compute the skewness of a sample. This is a measure of the asymmetry of its distribution. A sample with negative skew is said to be left-skewed. Most of its mass is on the right of the distribution, with the tail on the left.

  skewness $ U.to [1,100,101,102,103]
  ==> -1.497681449918257

A sample with positive skew is said to be right-skewed.

  skewness $ U.to [1,2,3,4,100]
  ==> 1.4975367033335198

A sample's skewness is not defined if its variance is zero. This function performs two passes over the sample, so is not subject to stream fusion. For samples containing many values very close to the mean, this function is subject to inaccuracy due to catastrophic cancellation.
kurtosis: Compute the excess kurtosis of a sample. This is a measure of the "peakedness" of its distribution. A high kurtosis indicates that more of the sample's variance is due to infrequent severe deviations, rather than more frequent modest deviations. A sample's excess kurtosis is not defined if its variance is zero. This function performs two passes over the sample, so is not subject to stream fusion. For samples containing many values very close to the mean, this function is subject to inaccuracy due to catastrophic cancellation.
variance: Maximum likelihood estimate of a sample's variance. Also known as the population variance, where the denominator is n.
varianceUnbiased: Unbiased estimate of a sample's variance. Also known as the sample variance, where the denominator is n-1.
meanVariance: Calculate the mean and the maximum likelihood estimate of the variance. This function should be used if both mean and variance are required, since it will calculate the mean only once.
meanVarianceUnb: Calculate the mean and the unbiased estimate of the variance. This function should be used if both mean and variance are required, since it will calculate the mean only once.
stdDev: Standard deviation. This is simply the square root of the unbiased estimate of the variance.
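The single-pass, numerically stable recurrence mentioned for mean extends to the variance: Welford's update keeps a running mean m and a running sum of squared deviations, so both statistics fall out of one fold. A list-based sketch (illustrative, not the library's code):

```haskell
import Data.List (foldl')

-- One-pass mean and maximum likelihood variance (denominator n)
-- via Welford's recurrence.
meanVariance :: [Double] -> (Double, Double)
meanVariance xs = (m, m2 / fromIntegral n)
  where
    (n, m, m2) = foldl' step (0 :: Int, 0, 0) xs
    step (k, mu, s) x =
      let k'  = k + 1
          d   = x - mu
          mu' = mu + d / fromIntegral k'   -- updated running mean
      in (k', mu', s + d * (x - mu'))      -- updated sum of squared deviations
```

For [1,2,3,4,5] this yields a mean of 3 and a population variance of 2.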
varianceWeighted: Weighted variance. This is biased estimation.
fastVariance: Maximum likelihood estimate of a sample's variance.
fastVarianceUnbiased: Unbiased estimate of a sample's variance.
fastStdDev: Standard deviation. This is simply the square root of the maximum likelihood estimate of the variance.

Statistics.Distribution.Exponential  [portable; experimental; maintainer: bos@serpentine.com]

exponential: Create an exponential distribution. Argument: the lambda (scale) parameter.
exponentialFromSample: Create an exponential distribution from a sample. No tests are made to check whether it really is exponential.

Statistics.Distribution.Normal  [portable; experimental; maintainer: bos@serpentine.com]

NormalDistribution: The normal distribution.
standard: The standard normal distribution, with mean equal to 0 and variance equal to 1.
normalDistr: Create a normal distribution from parameters. Arguments: the mean of the distribution; the variance of the distribution.
normalFromSample: Create a distribution using parameters estimated from a sample. The variance is estimated using the maximum likelihood method (biased estimation).

Statistics.Math  [portable; experimental; maintainer: bos@serpentine.com]

chebyshev: Evaluate a Chebyshev polynomial of the first kind. Uses Clenshaw's algorithm. Arguments: the parameter of each function; the coefficients of each polynomial term, in increasing order.
chebyshevBroucke: Evaluate a Chebyshev polynomial of the first kind. Uses Broucke's ECHEB algorithm and his convention for coefficient handling, and so gives different results than chebyshev for the same inputs. Arguments: the parameter of each function; the coefficients of each polynomial term, in increasing order.
logChooseFast: Quickly compute the natural logarithm of n `choose` k, with no checking.
choose: Compute the binomial coefficient n `choose` k. For values of k > 30, this uses an approximation for performance reasons. The approximation is accurate to 12 decimal places in the worst case. Example: 7 `choose` 3 == 35.
factorial: Compute the factorial function n!. Returns positive infinity if the input is above 170 (above which the result cannot be represented by a 64-bit Double).
logFactorial: Compute the natural logarithm of the factorial function. Gives 16 decimal digits of precision.
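For modest n, the binomial coefficient can also be computed exactly with the multiplicative formula, dividing at each step so every intermediate value stays an integer. This is a self-contained sketch, not the library's choose (which trades exactness for speed at large k):

```haskell
-- Exact binomial coefficient via the multiplicative formula.
-- Each intermediate product is divisible by i, so `div` is exact.
choose :: Integer -> Integer -> Integer
choose n k
  | k < 0 || k > n = 0
  | otherwise      = foldl step 1 [1 .. k']
  where
    k' = min k (n - k)                       -- exploit symmetry C(n,k) = C(n,n-k)
    step acc i = acc * (n - k' + i) `div` i
```

This reproduces the document's own example: choose 7 3 evaluates to 35.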
incompleteGamma: Compute the normalized lower incomplete gamma function P(s, x). Normalization means that P(s, infinity) = 1. Uses Algorithm AS 239 by Shea. Arguments: s; x.
logGamma: Compute the logarithm of the gamma function, Gamma(x). Uses Algorithm AS 245 by Macleod. Gives an accuracy of 10-12 significant decimal digits, except for small regions around x = 1 and x = 2, where the function goes to zero. For greater accuracy, use logGammaL. Returns positive infinity if the input is outside of the range 0 < x <= 1e305.
logGammaL: Compute the logarithm of the gamma function, Gamma(x). Uses a Lanczos approximation. This function is slower than logGamma, but gives 14 or more significant decimal digits of accuracy, except around x = 1 and x = 2, where the function goes to zero. Returns positive infinity if the input is outside of the range 0 < x <= 1e305.
logGammaCorrection: Compute the log gamma correction factor for x >= 10. This correction factor is suitable for an alternate (but less numerically accurate) definition of logGamma:

  lgg x = 0.5 * log (2*pi) + (x - 0.5) * log x - x + logGammaCorrection x

logBeta: Compute the natural logarithm of the beta function.
log1p: Compute the natural logarithm of 1 + x. This is accurate even for values of x near zero, where use of log (1 + x) would lose precision.

Statistics.Distribution.Binomial  [portable; experimental; maintainer: bos@serpentine.com]

BinomialDistribution: The binomial distribution.
bdTrials: Number of trials.
bdProbability: Probability.
binomial: Construct a binomial distribution. Arguments: the number of trials; the probability.

Statistics.Distribution.ChiSquared  [portable; experimental; maintainer: bos@serpentine.com]

ChiSquared: The chi-squared distribution.
chiSquaredNDF: Get the number of degrees of freedom.
chiSquared: Construct a chi-squared distribution. Argument: the number of degrees of freedom.

Statistics.Distribution.Gamma  [portable; experimental; maintainer: bos@serpentine.com]

GammaDistribution: The gamma distribution.
gdShape: Shape parameter, k.
gdScale: Scale parameter, theta.
gammaDistr

Statistics.Distribution.Hypergeometric  [portable; experimental; maintainer: bos@serpentine.com]

HypergeometricDistribution
hdM: The parameter m.
hdL: The parameter l.
hdK: The parameter k.
hypergeometric

Statistics.Distribution.Poisson  [portable; experimental; maintainer: bos@serpentine.com]

PoissonDistribution
poissonLambda
poisson: Create a Poisson distribution.
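The precision claim for log1p can be realised with Kahan's classic correction: compute u = 1 + x, and repair the rounding of that addition by scaling log u with x / (u - 1). This well-known trick is shown for illustration; it is not necessarily the library's implementation:

```haskell
-- Accurate log(1 + x) near zero, via Kahan's correction.
-- When 1 + x rounds to exactly 1, log(1 + x) ~ x itself.
log1p :: Double -> Double
log1p x
  | u == 1    = x
  | otherwise = log u * x / (u - 1)
  where u = 1 + x
```

For tiny x the naive log (1 + x) returns 0, losing all precision, while this version returns x.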
Statistics.Sample.Powers  [portable; experimental; maintainer: bos@serpentine.com]

powers: O(n). Collect the n simple powers of a sample. Functions computed over a sample's simple powers require at least a certain number (or order) of powers to be collected: to compute the kth centralMoment, at least k simple powers must be collected; for the variance, at least 2 simple powers are needed; for skewness, we need at least 3; and for kurtosis, at least 4 are required. This function is subject to stream fusion. Argument: n, the number of powers, where n >= 2.
order: The order (number) of simple powers collected from a sample.
centralMoment: Compute the kth central moment of a sample. The central moment is also known as the moment about the mean.
variance: Maximum likelihood estimate of a sample's variance. Also known as the population variance, where the denominator is n. This is the second central moment of the sample. This is less numerically robust than the variance function in the Statistics.Sample module, but the number is essentially free to compute if you have already collected a sample's simple powers. Requires powers with n at least 2.
stdDev: Standard deviation. This is simply the square root of the maximum likelihood estimate of the variance.
varianceUnbiased: Unbiased estimate of a sample's variance. Also known as the sample variance, where the denominator is n-1. Requires powers with n at least 2.
skewness: Compute the skewness of a sample. This is a measure of the asymmetry of its distribution. A sample with negative skew is said to be left-skewed. Most of its mass is on the right of the distribution, with the tail on the left.

  skewness . powers 3 $ U.to [1,100,101,102,103]
  ==> -1.497681449918257

A sample with positive skew is said to be right-skewed.

  skewness . powers 3 $ U.to [1,2,3,4,100]
  ==> 1.4975367033335198

A sample's skewness is not defined if its variance is zero. Requires powers with n at least 3.
kurtosis: Compute the excess kurtosis of a sample. This is a measure of the "peakedness" of its distribution. A high kurtosis indicates that the sample's variance is due more to infrequent severe deviations than to frequent modest deviations. A sample's excess kurtosis is not defined if its variance is zero. Requires powers with n at least 4.
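The simple-powers idea can be sketched on lists: collect the sums of x^0, x^1, ... in one pass, then derive statistics from those raw moments, e.g. the population variance as sum(x^2)/n0 - (sum(x)/n0)^2 with n0 = sum(x^0). This is an illustrative re-implementation; the indexing convention here (a list of the first n power sums) is my own simplification, not the library's Powers representation.

```haskell
-- One pass over the sample, accumulating [sum x^0, sum x^1, ..., sum x^(n-1)].
powers :: Int -> [Double] -> [Double]
powers n xs = foldl add (replicate n 0) xs
  where add acc x = zipWith (+) acc (take n (iterate (* x) 1))

-- Population variance from the first three power sums (x^0, x^1, x^2).
varianceFromPowers :: [Double] -> Double
varianceFromPowers (p0 : p1 : p2 : _) = p2 / p0 - (p1 / p0) ^ (2 :: Int)
varianceFromPowers _ = error "need at least the first three power sums"
```

As the documentation notes, this route is less numerically robust than a dedicated variance pass, but nearly free once the powers are collected.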
count: The number of elements in the original Sample. This is the sample's zeroth simple power.
sum: The sum of elements in the original Sample. This is the sample's first simple power.
mean: The arithmetic mean of elements in the original Sample. This is less numerically robust than the mean function in the Statistics.Sample module, but the number is essentially free to compute if you have already collected a sample's simple powers.

Statistics.Test.NonParametric  [portable; experimental; maintainer: bos@serpentine.com]

wilcoxonRankSums: The Wilcoxon rank-sums test. This test calculates the sum of ranks for the given two samples. The samples are ordered and assigned ranks (ties are given their average rank); these ranks are then summed for each sample. The return value is (W_1, W_2), where W_1 is the sum of ranks of the first sample and W_2 is the sum of ranks of the second sample. This test is trivially transformed into the Mann-Whitney U test. You will probably want to use mannWhitneyU and the related functions for testing significance, but this function is exposed for completeness.
mannWhitneyU: The Mann-Whitney U test. This is sometimes known as the Mann-Whitney-Wilcoxon U test; confusingly, many sources state that the Mann-Whitney U test is the same as the Wilcoxon rank-sum test (which is provided as wilcoxonRankSums). The Mann-Whitney U is a simple transform of Wilcoxon's rank-sum test. Again confusingly, different sources state reversed definitions for U_1 and U_2, so it is worth being explicit about what this function returns.
Given two samples, the first, xs_1, of size n_1 and the second, xs_2, of size n_2, this function returns (U_1, U_2), where U_1 = W_1 - (n_1*(n_1+1))/2 and U_2 = W_2 - (n_2*(n_2+1))/2, and (W_1, W_2) is the return value of wilcoxonRankSums xs_1 xs_2. Some sources instead state that U_1 and U_2 should be the other way round, often expressing this using U_1' = n_1*n_2 - U_1 (since U_1 + U_2 = n_1*n_2). All of which you probably don't care about if you just feed this into mannWhitneyUSignificant.
mannWhitneyUCriticalValue: Calculates the critical value of Mann-Whitney U for the given sample sizes and significance level. This function returns the exact calculated value of U for all sample sizes; it does not use the normal approximation at all. Above sample size 20 it is generally recommended to use the normal approximation instead, but this function will calculate the higher critical values if you need them. The algorithm used to generate these values is a faster, memoised version of the simple unoptimised generating function given in section 2 of "The Mann Whitney Wilcoxon Distribution Using Linked Lists", Cheung and Klotz, Statistica Sinica 7 (1997), http://www3.stat.sinica.edu.tw/statistica/oldpdf/A7n316.pdf. Arguments: the sample sizes; the p-value (e.g. 0.05) for which you want the critical value. Returns: the critical value of U.
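The transform from rank sums to U described above is mechanical; a list-based sketch, with tied values sharing their average rank, might look like the following. The ranking helper is an illustrative assumption, not the library's implementation.

```haskell
import Data.Function (on)
import Data.List (groupBy, sortBy)
import Data.Ord (comparing)

-- (W_1, W_2): rank sums of two samples pooled together,
-- with tied values sharing their average rank.
rankSums :: [Double] -> [Double] -> (Double, Double)
rankSums xs ys = (sumFor 'x', sumFor 'y')
  where
    pooled = sortBy (comparing snd) ([('x', v) | v <- xs] ++ [('y', v) | v <- ys])
    ranked = concat (go 1 (groupBy ((==) `on` snd) pooled))
    go _ [] = []
    go i (g : gs) =
      let k   = length g
          avg = fromIntegral (2 * i + k - 1) / 2   -- mean of ranks i .. i+k-1
      in [(avg, fst p) | p <- g] : go (i + k) gs
    sumFor t = sum [r | (r, t') <- ranked, t' == t]

-- U statistics from rank sums: U_i = W_i - n_i*(n_i+1)/2.
mannWhitneyU :: [Double] -> [Double] -> (Double, Double)
mannWhitneyU xs ys = (w1 - n1 * (n1 + 1) / 2, w2 - n2 * (n2 + 1) / 2)
  where
    (w1, w2) = rankSums xs ys
    n1 = fromIntegral (length xs)
    n2 = fromIntegral (length ys)
```

Note the invariant mentioned in the text: U_1 + U_2 always equals n_1*n_2.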
mannWhitneyUSignificant: Calculates whether the Mann-Whitney U test is significant. If both sample sizes are less than or equal to 20, the exact U critical value (as calculated by mannWhitneyUCriticalValue) is used. If either sample is larger than 20, the normal approximation is used instead. If you use a one-tailed test, the test indicates whether the first sample is significantly larger than the second. If you want the opposite, simply reverse the order in both the sample sizes and the (U_1, U_2) pair. Arguments: whether to perform a one-tailed test (see description above); the sample sizes from which the (U_1, U_2) values were derived; the p-value at which to test (e.g. 0.05); the (U_1, U_2) values from mannWhitneyU. Returns: Just True if the test is significant, Just False if it is not, and Nothing if the sample was too small to make a decision.
wilcoxonMatchedPairSignedRank: The Wilcoxon matched-pairs signed-rank test. The value returned is the pair (T+, T-). T+ is the sum of positive ranks (the ranks of the differences where the first parameter is higher), whereas T- is the sum of negative ranks (the ranks of the differences where the second parameter is higher). These values mean little by themselves, and should be combined with the wilcoxonSignificant function in this module to get a meaningful result. The samples are zipped together: if one is longer than the other, both are truncated to the length of the shorter sample. Note that:

  wilcoxonMatchedPairSignedRank == (\(x, y) -> (y, x)) . flip wilcoxonMatchedPairSignedRank

coefficients: The coefficients for x^0, x^1, x^2, etc., in the expression prod_{r=1}^s (1 + x^r). See the Mitic paper for details. We can define:

  f(1) = 1 + x
  f(r) = (1 + x^r) * f(r-1)
       = f(r-1) + x^r * f(r-1)

The effect of multiplying the equation by x^r is to shift all the coefficients by r down the list. This list will be processed lazily from the head.
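The recurrence f(r) = f(r-1) + x^r * f(r-1) translates directly into list operations: multiplying by x^r prepends r zeros to the coefficient list, and the polynomial addition is an element-wise sum. A sketch (illustrative; the name signedRankCoefficients is my own):

```haskell
-- Coefficients of x^0, x^1, ... in prod_{r=1}^s (1 + x^r),
-- built from f(r) = f(r-1) + x^r * f(r-1).
signedRankCoefficients :: Int -> [Integer]
signedRankCoefficients s = foldl step [1] [1 .. s]
  where
    step f r = zipWithLong (+) f (replicate r 0 ++ f)

-- Like zipWith, but keeping the tail of the longer list.
zipWithLong :: (a -> a -> a) -> [a] -> [a] -> [a]
zipWithLong _ [] ys = ys
zipWithLong _ xs [] = xs
zipWithLong op (x : xs) (y : ys) = op x y : zipWithLong op xs ys
```

For s = 2 the product (1 + x)(1 + x^2) expands to 1 + x + x^2 + x^3, i.e. the coefficient list [1,1,1,1].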
wilcoxonMatchedPairSignificant: Tests whether a given result from a Wilcoxon signed-rank matched-pairs test is significant at the given level. This function can perform a one-tailed or two-tailed test. If the first parameter to this function is False, the test is performed two-tailed, to check whether the two samples differ significantly. If the first parameter is True, the check is performed one-tailed, to decide whether the first sample (i.e. the first sample you passed to wilcoxonMatchedPairSignedRank) is greater than the second sample (i.e. the second sample you passed to wilcoxonMatchedPairSignedRank). If you wish to perform a one-tailed test in the opposite direction, you can either pass the parameters in a different order to wilcoxonMatchedPairSignedRank, or simply swap the values in the resulting pair before passing them to this function. Arguments: whether to perform a one-tailed test (see description above); the sample size from which the (T+, T-) values were derived; the p-value at which to test (e.g. 0.05); the (T+, T-) values from wilcoxonMatchedPairSignedRank. Returns: Just True if the test is significant, Just False if it is not, and Nothing if the sample was too small to make a decision.
wilcoxonMatchedPairCriticalValue: Obtains the critical value of T to compare against, given a sample size and a p-value (significance level). Your T value must be less than or equal to the return of this function in order for the test to work out significant. If there is a Nothing return, the sample size is too small to make a decision. wilcoxonSignificant tests the return value of wilcoxonMatchedPairSignedRank for you, so you should use wilcoxonSignificant for determining test results. However, this function is useful, for example, for generating lookup tables for Wilcoxon signed-rank critical values. The return values of this function are generated using the method detailed in the paper "Critical Values for the Wilcoxon Signed Rank Statistic", Peter Mitic, The Mathematica Journal, volume 6, issue 3, 1996, which can be found here: http://www.mathematica-journal.com/issue/v6i3/article/mitic/contents/63mitic.pdf. According to that paper, the results may differ from other published lookup tables, but (Mitic claims) the values obtained by this function will be the correct ones. Arguments: the sample size; the p-value (e.g. 0.05) for which you want the critical value. Returns: the critical value of T, or Nothing if the sample is too small to make a decision.
wilcoxonMatchedPairSignificance: Works out the significance level (p-value) of a T value, given a sample size and a T value from the Wilcoxon signed-rank matched-pairs test. See the notes on wilcoxonCriticalValue for how this is calculated. Arguments: the sample size; the value of T for which you want the significance. Returns: the significance (p-value).

Statistics.KernelDensity  [portable; experimental; maintainer: bos@serpentine.com]

Kernel: The convolution kernel.
Its parameters are as follows: the scaling factor, 1/nh; the bandwidth, h; a point at which to sample the input, p; one sample value, v.
Bandwidth: The width of the convolution kernel used.
Points: Points from the range of a Sample.
epanechnikovBW: Bandwidth estimator for an Epanechnikov kernel.
gaussianBW: Bandwidth estimator for a Gaussian kernel.
bandwidth: Compute the optimal bandwidth from the observed data for the given kernel.
choosePoints: Choose a uniform range of points at which to estimate a sample's probability density function. If you are using a Gaussian kernel, multiply the sample's bandwidth by 3 before passing it to this function. If this function is passed an empty vector, it returns values of positive and negative infinity. Arguments: the number of points to select, n; the sample bandwidth, h; the input data.
epanechnikovKernel: Epanechnikov kernel for probability density function estimation.
gaussianKernel: Gaussian kernel for probability density function estimation.
estimatePDF: Kernel density estimator, providing a non-parametric way of estimating the PDF of a random variable. Arguments: the kernel function; the bandwidth, h; the sample data; the points at which to estimate.
simplePDF: A helper for creating a simple kernel density estimation function with automatically chosen bandwidth and estimation points. Arguments: the bandwidth function; the kernel function; the bandwidth scaling factor (3 for a Gaussian kernel, 1 for all others); the number of points at which to estimate; the sample data.
epanechnikovPDF: Simple Epanechnikov kernel density estimator. Returns the uniformly spaced points from the sample range at which the density function was estimated, and the estimates at those points. Arguments: the number of points at which to estimate; the data sample.
gaussianPDF: Simple Gaussian kernel density estimator. Returns the uniformly spaced points from the sample range at which the density function was estimated, and the estimates at those points. Arguments: the number of points at which to estimate; the data sample.
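A minimal Gaussian kernel density estimate over a list of evaluation points can be sketched as follows; this is an illustration of the technique, not the library's vector-based estimatePDF:

```haskell
-- Gaussian KDE: the average of Gaussian bumps of width h centred on the data,
-- evaluated at each requested point.
gaussianKDE :: Double -> [Double] -> [Double] -> [Double]
gaussianKDE h samples points = map density points
  where
    n = fromIntegral (length samples)
    density p = sum [ gauss ((p - v) / h) | v <- samples ] / (n * h)
    gauss u = exp (-0.5 * u * u) / sqrt (2 * pi)   -- standard normal density
```

With a single sample at 0 and h = 1, the estimate at 0 is the standard normal peak, 1/sqrt(2*pi), and the estimate is symmetric about the data point.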
Statistics.Resampling.Bootstrap  [portable; experimental; maintainer: bos@serpentine.com]

Estimate: A point and interval estimate computed via a Resample.
estPoint: Point estimate.
estLowerBound: Lower bound of the estimate interval (i.e. the lower bound of the confidence interval).
estUpperBound: Upper bound of the estimate interval (i.e. the upper bound of the confidence interval).
estConfidenceLevel: Confidence level of the confidence intervals.
bootstrapBCA: Bias-corrected accelerated (BCA) bootstrap. This adjusts for both bias and skewness in the resampled distribution. Arguments: the confidence level; the sample data; the estimators; the resampled data.

Statistics.Autocorrelation  [portable; experimental; maintainer: bos@serpentine.com]

autocovariance: Compute the autocovariance of a sample, i.e. the covariance of the sample against a shifted version of itself.
autocorrelation: Compute the autocorrelation function of a sample, and the upper and lower bounds of confidence intervals for each element. Note: the calculation of the 95% confidence interval assumes a stationary Gaussian process.
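The autocovariance just described, the covariance of the sample against a lagged copy of itself, can be sketched on lists (an illustration, not the library's code):

```haskell
-- Autocovariance at lag k: the mean of products of mean-centred pairs
-- (x_i, x_{i+k}). Divides by the full length n, the biased convention.
autocovariance :: Int -> [Double] -> Double
autocovariance k xs = sum (zipWith (*) centred (drop k centred)) / n
  where
    n       = fromIntegral (length xs)
    m       = sum xs / n
    centred = map (subtract m) xs
```

At lag 0 this reduces to the population variance; dividing each lag's value by the lag-0 value gives the autocorrelation function.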