union-find-0.2: Efficient union and equivalence testing of sets.

Safe HaskellSafe-Infered



An implementation of Tarjan's UNION-FIND algorithm. (Robert E Tarjan. "Efficiency of a Good But Not Linear Set Union Algorithm", JACM 22(2), 1975)

The algorithm implements three operations efficiently (all amortised O(1)):

  1. Check whether two elements are in the same equivalence class.
  2. Create a union of two equivalence classes.
  3. Look up the descriptor of the equivalence class.

The implementation is based on mutable references. Each equivalence class has exactly one member that serves as its representative element. Every element either is the representative element of its equivalence class or points to another element in the same equivalence class. Equivalence testing thus consists of following the pointers to the representative elements and then comparing these for identity.

The algorithm performs lazy path compression. That is, whenever we walk along a path greater than length 1 we automatically update the pointers along the path to directly point to the representative element. Consequently future lookups will be have a path length of at most 1.



data Point a Source

The abstract type of an element of the sets we work on. It is parameterised over the type of the descriptor.


Eq (Point a) 

fresh :: a -> IO (Point a)Source

O(1). Create a fresh point and return it. A fresh point is in the equivalence class that contains only itself.

repr :: Point a -> IO (Point a)Source

O(1). repr point returns the representative point of point's equivalence class.

This method performs the path compresssion.

union :: Point a -> Point a -> IO ()Source

O(1). Join the equivalence classes of the points (which must be distinct). The resulting equivalence class will get the descriptor of the second argument.

union' :: Point a -> Point a -> (a -> a -> IO a) -> IO ()Source

Like union, but sets the descriptor returned from the callback.

The intention is to keep the descriptor of the second argument to the callback, but the callback might adjust the information of the descriptor or perform side effects.

equivalent :: Point a -> Point a -> IO BoolSource

O(1). Return True if both points belong to the same | equivalence class.

redundant :: Point a -> IO BoolSource

O(1). Returns True for all but one element of an equivalence class. That is, if ps = [p1, .., pn] are all in the same equivalence class, then the following assertion holds.

 do rs <- mapM redundant ps
    assert (length (filter (==False) rs) == 1)

It is unspecified for which element function returns False, so be really careful when using this.

descriptor :: Point a -> IO aSource

O(1). Return the descriptor associated with argument point's equivalence class.

setDescriptor :: Point a -> a -> IO ()Source

O(1). Replace the descriptor of the point's equivalence class with the second argument.

modifyDescriptor :: Point a -> (a -> a) -> IO ()Source