cuckoo-filter-0.1.0.1: Pure and impure Cuckoo Filter

Copyright(c) Chris Coffey 2018
LicenseMIT
Maintainerchris@foldl.io
Stabilityexperimental
Safe HaskellNone
LanguageHaskell2010

Data.CuckooFilter

Contents

Description

Cuckoo filters are an alternative data structure to Bloom filters. They use a different approach to hashing items into the filter, which provides different behavior under load.

Inserting an item in to a Bloom filter always succeeds, although as the load factor on the filter increases the false positive probability trends towards %100. Cuckoo filters on the other hand hold the false positive probability constant under load, but will begin to fail inserts.

Cuckoo filters also support deletion natively, which allows reclaiming some used space from the filter to ward off insertion failures.

For more details see: Fan, . Cuckoo Filter: Practically Better Than Bloom. Retrieved August 22, 2016, from https://www.cs.cmu.edu/~dga/papers/cuckoo-conext2014.pdf

Synopsis

The Cuckoo Filter

data Size Source #

A non-zero natural number. Generally this is a power of two, although there's no hard requirement for that given the current implementation.

Instances
Eq Size Source # 
Instance details

Defined in Data.CuckooFilter.Internal

Methods

(==) :: Size -> Size -> Bool #

(/=) :: Size -> Size -> Bool #

Ord Size Source # 
Instance details

Defined in Data.CuckooFilter.Internal

Methods

compare :: Size -> Size -> Ordering #

(<) :: Size -> Size -> Bool #

(<=) :: Size -> Size -> Bool #

(>) :: Size -> Size -> Bool #

(>=) :: Size -> Size -> Bool #

max :: Size -> Size -> Size #

min :: Size -> Size -> Size #

Show Size Source # 
Instance details

Defined in Data.CuckooFilter.Internal

Methods

showsPrec :: Int -> Size -> ShowS #

show :: Size -> String #

showList :: [Size] -> ShowS #

Generic Size Source # 
Instance details

Defined in Data.CuckooFilter.Internal

Associated Types

type Rep Size :: * -> * #

Methods

from :: Size -> Rep Size x #

to :: Rep Size x -> Size #

ToJSON Size Source # 
Instance details

Defined in Data.CuckooFilter.Internal

FromJSON Size Source # 
Instance details

Defined in Data.CuckooFilter.Internal

Serialize Size Source # 
Instance details

Defined in Data.CuckooFilter.Internal

Methods

put :: Putter Size #

get :: Get Size #

type Rep Size Source # 
Instance details

Defined in Data.CuckooFilter.Internal

type Rep Size = D1 (MetaData "Size" "Data.CuckooFilter.Internal" "cuckoo-filter-0.1.0.1-8pf9DemnDmCCz8s6xcSYnK" True) (C1 (MetaCons "Size" PrefixI False) (S1 (MetaSel (Nothing :: Maybe Symbol) NoSourceUnpackedness NoSourceStrictness DecidedLazy) (Rec0 Natural)))

makeSize :: Natural -> Maybe Size Source #

Safely make a Size or fail if a 0 is provided.

data Filter a Source #

A Cuckoo Filter with a fixed size. The current implementation uses 8 bit fingerprints and 4 element buckets.

Instances
Eq (Filter a) Source # 
Instance details

Defined in Data.CuckooFilter.Internal

Methods

(==) :: Filter a -> Filter a -> Bool #

(/=) :: Filter a -> Filter a -> Bool #

Show (Filter a) Source # 
Instance details

Defined in Data.CuckooFilter.Internal

Methods

showsPrec :: Int -> Filter a -> ShowS #

show :: Filter a -> String #

showList :: [Filter a] -> ShowS #

Generic (Filter a) Source # 
Instance details

Defined in Data.CuckooFilter.Internal

Associated Types

type Rep (Filter a) :: * -> * #

Methods

from :: Filter a -> Rep (Filter a) x #

to :: Rep (Filter a) x -> Filter a #

ToJSON (Filter a) Source # 
Instance details

Defined in Data.CuckooFilter.Internal

FromJSON (Filter a) Source # 
Instance details

Defined in Data.CuckooFilter.Internal

Serialize (Filter a) Source # 
Instance details

Defined in Data.CuckooFilter.Internal

Methods

put :: Putter (Filter a) #

get :: Get (Filter a) #

type Rep (Filter a) Source # 
Instance details

Defined in Data.CuckooFilter.Internal

type Rep (Filter a) = D1 (MetaData "Filter" "Data.CuckooFilter.Internal" "cuckoo-filter-0.1.0.1-8pf9DemnDmCCz8s6xcSYnK" False) (C1 (MetaCons "F" PrefixI True) (S1 (MetaSel (Just "buckets") NoSourceUnpackedness NoSourceStrictness DecidedLazy) (Rec0 (IntMap Bucket)) :*: (S1 (MetaSel (Just "numBuckets") NoSourceUnpackedness SourceStrict DecidedStrict) (Rec0 Natural) :*: S1 (MetaSel (Just "size") NoSourceUnpackedness SourceStrict DecidedStrict) (Rec0 Size))))

empty Source #

Arguments

:: Size

The initial size of the filter

-> Filter a 

Creates a new & empty Filter of size s

Working with a Cuckoo Filter

insert Source #

Arguments

:: Hashable a 
=> Filter a

Current filter state

-> a

Item to hash and store in the filter

-> Maybe (Filter a) 

In exchange for the stable false-positive probability, insertion into a cuckoo filter may fail as the load factor increases.

Amoritized O(1)

Note, because of how cuckoo hashing works, inserts will fail when there's a set of items s that hash to the same fingerprint and share either IndexA or IndexB with probability (2numBuckets * 1256)^ (|s|-1). Alternatively, inserting the same item 2b+1 times will trigger the failure as well.

member Source #

Arguments

:: Hashable a 
=> a

Check if this element is in the filter

-> Filter a

The filter

-> Bool 

Checks whether a given item is within the filter.

O(1)

delete :: Hashable a => Filter a -> a -> Filter a Source #

Deletes a single occurance of a from the Filter. It first checks for a value in the primary index, then in the secondary index.

Deleting an element not in the Cuckoo Filter is a noop and returns the filter unchanged.

O(1)