Safe Haskell | None |
---|---|
Language | Haskell2010 |
The Relative Error Quantile (REQ) sketch provides extremely high accuracy at a chosen end of the rank domain. This is best illustrated with some rank domain accuracy plots that compare the KLL quantiles sketch to the REQ sketch.
This first plot illustrates the typical error behavior of the KLL sketch (also the quantiles/DoublesSketch). The error is flat for all ranks (0, 1). The green and yellow lines correspond to +/- one RSE at 68% confidence; the blue and red lines, +- two RSE at 95% confidence; and, the purple and brown lines +- 3 RSE at 99% confidence. The reason all the curves pinch at 0 and 1.0, is because the sketch knows with certainty that a request for a quantile at rank = 0 is the minimum value of the stream; and a request for a quantiles at rank = 1.0, is the maximum value of the stream. Both of which the sketch tracks.
The next plot is the exact same data and queries fed to the REQ sketch set for High Rank Accuracy (HRA) mode. In this plot, starting at a rank of about 0.3, the contour lines start converging and actually reach zero error at rank 1.0. Therefore the error (the inverse of accuracy) is relative to the requested rank, thus the name of the sketch. This means that the user can perform getQuantile(rank) queries, where rank = .99999 and get accurate results.
This next plot is also the same data and queries, except the REQ sketch was configured for Low Rank Accuracy (LRA). In this case the user can perform getQuantiles(rank) queries, where rank = .00001 and get accurate results.
Synopsis
- data ReqSketch (k :: Nat) s = ReqSketch {
- rankAccuracySetting :: !RankAccuracy
- criterion :: !Criterion
- sketchRng :: !(Gen s)
- totalN :: !(URef s Word64)
- minValue :: !(URef s Double)
- maxValue :: !(URef s Double)
- sumValue :: !(URef s Double)
- retainedItems :: !(URef s Int)
- maxNominalCapacitiesSize :: !(URef s Int)
- aux :: !(MutVar s (Maybe ReqAuxiliary))
- compactors :: !(MutVar s (Vector (ReqCompactor s)))
- type ValidK k = (4 <= k, k <= 1024, (k `Mod` 2) ~ 0)
- mkReqSketch :: forall k m. (PrimMonad m, ValidK k, KnownNat k) => RankAccuracy -> m (ReqSketch k (PrimState m))
- getN :: PrimMonad m => ReqSketch k (PrimState m) -> m Word64
- getSum :: PrimMonad m => ReqSketch k (PrimState m) -> m Double
- getCompactors :: PrimMonad m => ReqSketch k (PrimState m) -> m (Vector (ReqCompactor (PrimState m)))
- getNumLevels :: PrimMonad m => ReqSketch k (PrimState m) -> m Int
- getRetainedItems :: PrimMonad m => ReqSketch k (PrimState m) -> m Int
- computeTotalRetainedItems :: PrimMonad m => ReqSketch k (PrimState m) -> m Int
- cumulativeDistributionFunction :: (PrimMonad m, KnownNat k) => ReqSketch k (PrimState m) -> [Double] -> m (Maybe [Double])
- data RankAccuracy
- rankAccuracy :: ReqSketch s k -> RankAccuracy
- relativeStandardError :: ReqSketch s k -> Int -> Double -> RankAccuracy -> Int -> Double
- maximum :: PrimMonad m => ReqSketch k (PrimState m) -> m Double
- count :: (PrimMonad m, s ~ PrimState m, KnownNat k) => ReqSketch k s -> Double -> m Word64
- probabilityMassFunction :: (PrimMonad m, KnownNat k) => ReqSketch k (PrimState m) -> [Double] -> m [Double]
- quantile :: (PrimMonad m, KnownNat k) => ReqSketch k (PrimState m) -> Double -> m Double
- quantiles :: (PrimMonad m, KnownNat k) => ReqSketch k (PrimState m) -> [Double] -> m [Double]
- rank :: (PrimMonad m, KnownNat k) => ReqSketch k (PrimState m) -> Double -> m Double
- rankLowerBound :: (PrimMonad m, KnownNat k) => ReqSketch k (PrimState m) -> Double -> Int -> m Double
- ranks :: (PrimMonad m, s ~ PrimState m, KnownNat k) => ReqSketch k s -> [Double] -> m [Double]
- rankUpperBound :: (PrimMonad m, KnownNat k) => ReqSketch k (PrimState m) -> Double -> Int -> m Double
- null :: PrimMonad m => ReqSketch k (PrimState m) -> m Bool
- isEstimationMode :: PrimMonad m => ReqSketch k (PrimState m) -> m Bool
- isLessThanOrEqual :: ReqSketch s k -> Bool
- merge :: (PrimMonad m, s ~ PrimState m, KnownNat k) => ReqSketch k s -> ReqSketch k s -> m (ReqSketch k s)
- update :: (PrimMonad m, KnownNat k) => ReqSketch k (PrimState m) -> Double -> m ()
- data CumulativeDistributionInvariants
- mkAuxiliaryFromReqSketch :: PrimMonad m => ReqSketch k (PrimState m) -> m ReqAuxiliary
Documentation
data ReqSketch (k :: Nat) s Source #
ReqSketch | |
|
Instances
mkReqSketch :: forall k m. (PrimMonad m, ValidK k, KnownNat k) => RankAccuracy -> m (ReqSketch k (PrimState m)) Source #
getCompactors :: PrimMonad m => ReqSketch k (PrimState m) -> m (Vector (ReqCompactor (PrimState m))) Source #
cumulativeDistributionFunction Source #
:: (PrimMonad m, KnownNat k) | |
=> ReqSketch k (PrimState m) | |
-> [Double] | Returns an approximation to the Cumulative Distribution Function (CDF), which is the cumulative analog of the PMF, of the input stream given a set of splitPoint (values). The resulting approximations have a probabilistic guarantee that be obtained, a priori, from the getRSE(int, double, boolean, long) function. If the sketch is empty this returns |
-> m (Maybe [Double]) |
Returns an approximation to the Cumulative Distribution Function (CDF), which is the cumulative analog of the PMF, of the input stream given a set of splitPoint (values).
data RankAccuracy Source #
HighRanksAreAccurate | High ranks are prioritized for better accuracy. |
LowRanksAreAccurate | Low ranks are prioritized for better accuracy |
Instances
Eq RankAccuracy Source # | |
Defined in DataSketches.Quantiles.RelativeErrorQuantile.Types (==) :: RankAccuracy -> RankAccuracy -> Bool # (/=) :: RankAccuracy -> RankAccuracy -> Bool # | |
Show RankAccuracy Source # | |
Defined in DataSketches.Quantiles.RelativeErrorQuantile.Types showsPrec :: Int -> RankAccuracy -> ShowS # show :: RankAccuracy -> String # showList :: [RankAccuracy] -> ShowS # |
rankAccuracy :: ReqSketch s k -> RankAccuracy Source #
relativeStandardError Source #
:: ReqSketch s k | |
-> Int | k - the given value of k |
-> Double | rank - the given normalized rank, a number in [0,1]. |
-> RankAccuracy | |
-> Int | totalN - an estimate of the total number of items submitted to the sketch. |
-> Double | an a priori estimate of relative standard error (RSE, expressed as a number in [0,1]). |
Returns an a priori estimate of relative standard error (RSE, expressed as a number in [0,1]). Derived from Lemma 12 in https://arxiv.org/abs/2004.01668v2, but the constant factors were modified based on empirical measurements.
maximum :: PrimMonad m => ReqSketch k (PrimState m) -> m Double Source #
Gets the largest value seen by this sketch
probabilityMassFunction Source #
:: (PrimMonad m, KnownNat k) | |
=> ReqSketch k (PrimState m) | |
-> [Double] | splitPoints - an array of m unique, monotonically increasing double values that divide the real number line into m+1 consecutive disjoint intervals. The definition of an "interval" is inclusive of the left splitPoint (or minimum value) and exclusive of the right splitPoint, with the exception that the last interval will include the maximum value. It is not necessary to include either the min or max values in these splitpoints. |
-> m [Double] | An array of m+1 doubles each of which is an approximation to the fraction of the input stream values (the mass) that fall into one of those intervals. The definition of an "interval" is inclusive of the left splitPoint and exclusive of the right splitPoint, with the exception that the last interval will include maximum value. |
Returns an approximation to the Probability Mass Function (PMF) of the input stream given a set of splitPoints (values). The resulting approximations have a probabilistic guarantee that be obtained, a priori, from the getRSE(int, double, boolean, long) function.
If the sketch is empty this returns an empty list.
:: (PrimMonad m, KnownNat k) | |
=> ReqSketch k (PrimState m) | |
-> Double | normRank - the given normalized rank |
-> m Double | the approximate quantile given the normalized rank. |
Gets the approximate quantile of the given normalized rank based on the lteq criterion.
:: (PrimMonad m, KnownNat k) | |
=> ReqSketch k (PrimState m) | |
-> [Double] | normRanks - the given array of normalized ranks. |
-> m [Double] | the array of quantiles that correspond to the given array of normalized ranks. |
Gets an array of quantiles that correspond to the given array of normalized ranks.
:: (PrimMonad m, KnownNat k) | |
=> ReqSketch k (PrimState m) | |
-> Double | value - the given value |
-> m Double | the normalized rank of the given value in the stream. |
Computes the normalized rank of the given value in the stream. The normalized rank is the fraction of values less than the given value; or if lteq is true, the fraction of values less than or equal to the given value.
:: (PrimMonad m, KnownNat k) | |
=> ReqSketch k (PrimState m) | |
-> Double | rank - the given rank, a value between 0 and 1.0. |
-> Int | numStdDev - the number of standard deviations. Must be 1, 2, or 3. |
-> m Double | an approximate lower bound rank. |
Returns an approximate lower bound rank of the given normalized rank.
ranks :: (PrimMonad m, s ~ PrimState m, KnownNat k) => ReqSketch k s -> [Double] -> m [Double] Source #
Gets an array of normalized ranks that correspond to the given array of values. TODO, make it ifaster
:: (PrimMonad m, KnownNat k) | |
=> ReqSketch k (PrimState m) | |
-> Double | rank - the given rank, a value between 0 and 1.0. |
-> Int | numStdDev - the number of standard deviations. Must be 1, 2, or 3. |
-> m Double | an approximate upper bound rank. |
Returns an approximate upper bound rank of the given rank.
null :: PrimMonad m => ReqSketch k (PrimState m) -> m Bool Source #
Returns true if this sketch is empty.
isEstimationMode :: PrimMonad m => ReqSketch k (PrimState m) -> m Bool Source #
Returns true if this sketch is in estimation mode.
isLessThanOrEqual :: ReqSketch s k -> Bool Source #
Returns the current comparison criterion.
merge :: (PrimMonad m, s ~ PrimState m, KnownNat k) => ReqSketch k s -> ReqSketch k s -> m (ReqSketch k s) Source #
Merge other sketch into this one.
update :: (PrimMonad m, KnownNat k) => ReqSketch k (PrimState m) -> Double -> m () Source #
Updates this sketch with the given item.
data CumulativeDistributionInvariants Source #
CumulativeDistributionInvariantsSplitsAreEmpty | |
CumulativeDistributionInvariantsSplitsAreNotFinite | |
CumulativeDistributionInvariantsSplitsAreNotUniqueAndMontonicallyIncreasing |
mkAuxiliaryFromReqSketch :: PrimMonad m => ReqSketch k (PrimState m) -> m ReqAuxiliary Source #