Safe Haskell	None
Language	Haskell2010

Data.PerfectHash.Construction

Description

Constructs a minimal perfect hash from a map of key-value pairs.

Overview of algorithm

A two-input hash function F(nonce, key) is used.

Keys are hashed into buckets for the first round with a nonce of 0.
Iterating over each bucket of size >= 2 in order of decreasing size, keep testing different nonce values until all members of the bucket fall into open slots in the final array. When a successful nonce is found, write it to the "intermediate" array at the bucket's position.
For each bucket of size 1, select an arbitrary open slot in the final array, and write the slot's index (after negation and subtracting 1) to the intermediate array.

According to this paper, the algorithm is assured to run in linear time.

Provenance

This implementation was adapted from Steve Hanov's Blog. A refactoring of that Python implementation may be found here. This Haskell implementation was transliterated and evolved from that refactoring.

Synopsis

createMinimalPerfectHash :: (ToHashableChunks k, Default v) => Map k v -> LookupTable v

Documentation

createMinimalPerfectHash Source #

Arguments

:: (ToHashableChunks k, Default v)
=> Map k v	key-value pairs
-> LookupTable v	output for use by `lookup` or a custom code generator

Generates a minimal perfect hash for a set of key-value pairs.

The keys must be instances of ToHashableChunks. The values may be of arbitrary type.

A Map is required as input to guarantee that there are no duplicate keys.