structures-0.2: "Advanced" Data Structures

Portabilitynon-portable
Stabilityexperimental
MaintainerEdward Kmett <ekmett@gmail.com>
Safe HaskellNone

Data.Vector.Map

Description

This module provides a Vector-based Map that is loosely based on the Cache Oblivious Lookahead Array (COLA) by Bender et al. from "Cache-Oblivious Streaming B-Trees", but with inserts converted from ephemerally amortized to persisently amortized using a technique from Overmars and van Leeuwen.

Currently this Map is implemented in an insert-only fashion. Deletions are left to future work or to another derived structure in case they prove expensive.

Currently, we also do not use fractional cascading, as it affects the constant factors badly enough to not pay for itself at the scales we are interested in. The naive O(log^2 n) lookup consistently outperforms the alternative.

Compared to the venerable Data.Map, this data structure currently consumes more memory, but it provides a more limited palette of operations with different asymptotics (~10x faster inserts at a million entries) and enables us to utilize contiguous storage.

NB: when used with boxed data this structure may hold onto references to old versions of things for many updates to come until sufficient operations have happened to merge them out of the COLA.

Synopsis

Documentation

data Map k a Source

This Map is implemented as an insert-only Cache Oblivious Lookahead Array (COLA) with amortized complexity bounds that are equal to those of a B-Tree, except for an extra log factor slowdown on lookups due to the lack of fractional cascading. It uses a traditional Data.Map as a nursery.

empty :: Map k vSource

O(1) The empty LA.

null :: Map k v -> BoolSource

O(1). Identify if a LA is the empty LA.

singleton :: (Arrayed k, Arrayed v) => k -> v -> Map k vSource

O(1) Construct a LA from a single key/value pair.

lookup :: (Ord k, Arrayed k, Arrayed v) => k -> Map k v -> Maybe vSource

O(log^2 N) worst-case. Lookup an element.

insert :: (Ord k, Arrayed k, Arrayed v) => k -> v -> Map k v -> Map k vSource

O((log N)/B) amortized loads for each cache. Insert an element.

fromList :: (Ord k, Arrayed k, Arrayed v) => [(k, v)] -> Map k vSource