Copyright | (c) Chris Coffey 2018 |
---|---|
License | MIT |
Maintainer | chris@foldl.io |
Stability | experimental |
Safe Haskell | None |
Language | Haskell2010 |
Cuckoo filters are an alternative data structure to Bloom filters. They use a different approach to hashing items into the filter, which provides different behavior under load.
Inserting an item in to a Bloom filter always succeeds, although as the load factor on the filter increases the false positive probability trends towards %100. Cuckoo filters on the other hand hold the false positive probability constant under load, but will begin to fail inserts.
Cuckoo filters also support deletion natively, which allows reclaiming some used space from the filter to ward off insertion failures.
For more details see: Fan, . Cuckoo Filter: Practically Better Than Bloom. Retrieved August 22, 2016, from https://www.cs.cmu.edu/~dga/papers/cuckoo-conext2014.pdf
The Cuckoo Filter
A non-zero natural number. Generally this is a power of two, although there's no hard requirement for that given the current implementation.
A Cuckoo Filter with a fixed size. The current implementation uses 8 bit fingerprints and 4 element buckets.
Instances
Eq (Filter a) Source # | |
Show (Filter a) Source # | |
Generic (Filter a) Source # | |
ToJSON (Filter a) Source # | |
Defined in Data.CuckooFilter.Internal | |
FromJSON (Filter a) Source # | |
Serialize (Filter a) Source # | |
type Rep (Filter a) Source # | |
Defined in Data.CuckooFilter.Internal type Rep (Filter a) = D1 (MetaData "Filter" "Data.CuckooFilter.Internal" "cuckoo-filter-0.1.0.2-8NKAT7gypEH8pARQbiuTP3" False) (C1 (MetaCons "F" PrefixI True) (S1 (MetaSel (Just "buckets") NoSourceUnpackedness NoSourceStrictness DecidedLazy) (Rec0 (IntMap Bucket)) :*: (S1 (MetaSel (Just "numBuckets") NoSourceUnpackedness SourceStrict DecidedStrict) (Rec0 Natural) :*: S1 (MetaSel (Just "size") NoSourceUnpackedness SourceStrict DecidedStrict) (Rec0 Size)))) |
Creates a new & empty Filter
of size s
Working with a Cuckoo Filter
:: Hashable a | |
=> Filter a | Current filter state |
-> a | Item to hash and store in the filter |
-> Maybe (Filter a) |
In exchange for the stable false-positive probability, insertion into a cuckoo filter may fail as the load factor increases.
Amoritized O(1)
Note, because of how cuckoo hashing works, inserts will fail when there's a set of items s that hash to the same fingerprint and share either IndexA or IndexB with probability (2numBuckets * 1256)^ (|s|-1). Alternatively, inserting the same item 2b+1 times will trigger the failure as well.
Checks whether a given item is within the filter.
O(1)