An implementation of Tarjan's UNION-FIND algorithm. (Robert E Tarjan. "Efficiency of a Good But Not Linear Set Union Algorithm", JACM 22(2), 1975)

The algorithm implements three operations efficiently (all amortised
`O(1)`

):

- Check whether two elements are in the same equivalence class.
- Create a union of two equivalence classes.
- Look up the descriptor of the equivalence class.

The implementation is based on mutable references. Each equivalence class has exactly one member that serves as its representative element. Every element either is the representative element of its equivalence class or points to another element in the same equivalence class. Equivalence testing thus consists of following the pointers to the representative elements and then comparing these for identity.

The algorithm performs lazy path compression. That is, whenever we walk along a path greater than length 1 we automatically update the pointers along the path to directly point to the representative element. Consequently future lookups will be have a path length of at most 1.

- data Point a
- fresh :: a -> IO (Point a)
- repr :: Point a -> IO (Point a)
- union :: Point a -> Point a -> IO ()
- union' :: Point a -> Point a -> (a -> a -> IO a) -> IO ()
- equivalent :: Point a -> Point a -> IO Bool
- redundant :: Point a -> IO Bool
- descriptor :: Point a -> IO a
- setDescriptor :: Point a -> a -> IO ()
- modifyDescriptor :: Point a -> (a -> a) -> IO ()

# Documentation

The abstract type of an element of the sets we work on. It is parameterised over the type of the descriptor.

fresh :: a -> IO (Point a)Source

*O(1)*. Create a fresh point and return it. A fresh point is in
the equivalence class that contains only itself.

repr :: Point a -> IO (Point a)Source

*O(1)*. `repr point`

returns the representative point of
`point`

's equivalence class.

This method performs the path compresssion.

union :: Point a -> Point a -> IO ()Source

*O(1)*. Join the equivalence classes of the points (which must be
distinct). The resulting equivalence class will get the descriptor
of the second argument.

union' :: Point a -> Point a -> (a -> a -> IO a) -> IO ()Source

Like `union`

, but sets the descriptor returned from the callback.

The intention is to keep the descriptor of the second argument to the callback, but the callback might adjust the information of the descriptor or perform side effects.

equivalent :: Point a -> Point a -> IO BoolSource

*O(1)*. Return `True`

if both points belong to the same
| equivalence class.

redundant :: Point a -> IO BoolSource

*O(1)*. Returns `True`

for all but one element of an equivalence
class. That is, if `ps = [p1, .., pn]`

are all in the same
equivalence class, then the following assertion holds.

do rs <- mapM redundant ps assert (length (filter (==False) rs) == 1)

It is unspecified for which element function returns `False`

, so be
really careful when using this.

descriptor :: Point a -> IO aSource

*O(1)*. Return the descriptor associated with argument point's
equivalence class.

setDescriptor :: Point a -> a -> IO ()Source

*O(1)*. Replace the descriptor of the point's equivalence class
with the second argument.

modifyDescriptor :: Point a -> (a -> a) -> IO ()Source