Copyright | (c) Chris Coffey 2018 |
---|---|

License | MIT |

Maintainer | chris@foldl.io |

Stability | experimental |

Safe Haskell | None |

Language | Haskell2010 |

Cuckoo filters are an alternative data structure to Bloom filters. They use a different approach to hashing items into the filter, which provides different behavior under load.

Inserting an item in to a Bloom filter always succeeds, although as the load factor on
the filter increases the false positive probability trends towards *%100*. Cuckoo filters
on the other hand hold the false positive probability constant under load, but will
begin to fail inserts.

Cuckoo filters also support deletion natively, which allows reclaiming some used space from the filter to ward off insertion failures.

For more details see: Fan, . Cuckoo Filter: Practically Better Than Bloom. Retrieved August 22, 2016, from https://www.cs.cmu.edu/~dga/papers/cuckoo-conext2014.pdf

## Synopsis

- data Size
- makeSize :: Natural -> Maybe Size
- data Filter a
- data MFilter a
- initialize :: CuckooFilter filt m => Size -> m (filt a)
- insert :: (Hashable a, Monad m, CuckooFilter filt m) => filt a -> a -> m (Maybe (filt a))
- member :: (Hashable a, Monad m, CuckooFilter filt m) => a -> filt a -> m Bool
- delete :: (Hashable a, Monad m, CuckooFilter filt m) => filt a -> a -> m (filt a)

# The Cuckoo Filter

A non-zero natural number. Generally this is a power of two, although there's no hard requirement for that given the current implementation.

A Cuckoo Filter with a fixed size. The current implementation uses 8 bit fingerprints and 4 element buckets.

## Instances

Monad m => CuckooFilter Filter m Source # | |

Defined in Data.CuckooFilter.Pure | |

Eq (Filter a) Source # | |

Show (Filter a) Source # | |

Generic (Filter a) Source # | |

ToJSON (Filter a) Source # | |

Defined in Data.CuckooFilter.Pure | |

FromJSON (Filter a) Source # | |

Serialize (Filter a) Source # | |

type Rep (Filter a) Source # | |

Defined in Data.CuckooFilter.Pure type Rep (Filter a) = D1 (MetaData "Filter" "Data.CuckooFilter.Pure" "cuckoo-filter-0.2.0.2-B7QL3ZxFbHGZXnXJGf4MU" False) (C1 (MetaCons "F" PrefixI True) (S1 (MetaSel (Just "buckets") NoSourceUnpackedness NoSourceStrictness DecidedLazy) (Rec0 (IntMap Bucket)) :*: (S1 (MetaSel (Just "numBuckets") NoSourceUnpackedness SourceStrict DecidedStrict) (Rec0 Natural) :*: S1 (MetaSel (Just "size") NoSourceUnpackedness SourceStrict DecidedStrict) (Rec0 Size)))) |

initialize :: CuckooFilter filt m => Size -> m (filt a) Source #

Create a new cuckoo filter of the specified size

# Working with a Cuckoo Filter

:: (Hashable a, Monad m, CuckooFilter filt m) | |

=> filt a | Current filter state |

-> a | Item to hash and store in the filter |

-> m (Maybe (filt a)) |

In exchange for the stable false-positive probability, insertion into a cuckoo filter may fail as the load factor increases.

Amoritized *O(1)*

Note, because of how cuckoo hashing works, inserts will fail when there's a set of items *s*
that hash to the same fingerprint and share either IndexA or IndexB with probability *(2*numBuckets * 1*256)^ (|s|-1)*.
Alternatively, inserting the same item *2b+1* times will trigger the failure as well.

:: (Hashable a, Monad m, CuckooFilter filt m) | |

=> a | Check if this element is in the filter |

-> filt a | The filter |

-> m Bool |

Checks whether a given item is within the filter.

*O(1)*

delete :: (Hashable a, Monad m, CuckooFilter filt m) => filt a -> a -> m (filt a) Source #

Deletes a single occurance of `a`

from the `Filter`

. It first checks for a value
in the primary index, then in the secondary index.

Deleting an element not in the Cuckoo Filter is a noop and returns the filter unchanged.

*O(1)*