Copyright | (c) Eben Kadile 2018 |
---|---|

License | BSD 3 Clause |

Maintainer | eben.cowley42@gmail.com |

Stability | experimental |

Safe Haskell | None |

Language | Haskell2010 |

This module provides functions for constructing neighborhood graphs, clique complexes, and Vietoris-Rips complexes, as well as the computation of Betti numbers.

A simplicial complex is like a generalization of a graph, where you can have any number of n-dimensional simplices (line-segments, triangles, tetrahedrons), glued along their boundaries so that the intersection of any two simplices is a simplicial complex. These objects are fundamental to topological data analysis.

An important thing to note about this module, and the library in general, is the difference between "fast" and "light" functions. Fast functions encode the metric in a 2D vector, so that distances don't need to be computed over and over again and can instead be accessed in constant time. Unfortunately, this takes O(n^2) memory so I would strongly recomend against using it for larger data sets unless you have a lot of RAM. Light functions do not use this speed optimization.

The neighborhood graph of point cloud data is a graph where an edge exists between two data points if and only if the vertices fall within a given distance of each other. Graphs here are more of a helper data type for constructing filtrations, which is why they have a somewhat odd representation. Not to worry, though, I've supplied a way of encoding graphs of a more generic representation.

The clique complex of a graph is a simplicial complex whose simplices are the complete subgraphs of the given graph. The Vietoris-Rips complex is the clique complex of the neighborhood graph.

Betti numbers measure the number of n-dimensional holes in a given simplicial complex. The 0th Betti number is the number of connected components, or clusters; the 1st Betti number is the number of loops or tunnels; the 2nd Betti number measures voids or hollow volumes; and if you have high-dimensional data, you might have higher Betti numbers representing the number of high-dimensional holes.

## Synopsis

- type Simplex = (Vector Int, Vector Int)
- type SimplicialComplex = (Int, Vector (Vector Simplex))
- type Graph a = Vector (Vector (a, Bool))
- sc2String :: SimplicialComplex -> String
- getDim :: SimplicialComplex -> Int
- encodeWeightedGraph :: Int -> (Int -> Int -> (a, Bool)) -> Graph a
- wGraph2sc :: Graph a -> SimplicialComplex
- indexGraph :: Graph a -> (Int, Int) -> (a, Bool)
- makeNbrhdGraph :: (Ord a, Eq b) => a -> (b -> b -> a) -> Either (Vector b) [b] -> Graph a
- makeNbrhdGraphPar :: (Ord a, Eq b) => a -> (b -> b -> a) -> Either (Vector b) [b] -> Graph a
- makeCliqueComplex :: Graph a -> SimplicialComplex
- makeCliqueComplexPar :: Graph a -> SimplicialComplex
- makeRipsComplexFast :: (Ord a, Eq b) => a -> (b -> b -> a) -> Either (Vector b) [b] -> (SimplicialComplex, Graph a)
- makeRipsComplexFastPar :: (Ord a, Eq b) => a -> (b -> b -> a) -> Either (Vector b) [b] -> (SimplicialComplex, Graph a)
- makeRipsComplexLight :: (Ord a, Eq b) => a -> (b -> b -> a) -> Either (Vector b) [b] -> SimplicialComplex
- makeRipsComplexLightPar :: (Ord a, Eq b) => a -> (b -> b -> a) -> Either (Vector b) [b] -> SimplicialComplex
- bettiNumbers :: SimplicialComplex -> [Int]
- bettiNumbersPar :: SimplicialComplex -> [Int]

# Types

type Simplex = (Vector Int, Vector Int) Source #

This is type synonym exists to make other synonyms more concise. A simplex is represented as a pair: the vector of its vertices (their index in the original data set), and the vector of the indices of the faces in the next lowest dimension of the simplicial complex. 1-simplices are the exception: they do not store reference to their faces because it would be redundant with their vertices.

type SimplicialComplex = (Int, Vector (Vector Simplex)) Source #

The first component of the pair is the number of vertices. The second component is a vector whose entries are vectors of simplices of the same dimension. Index 0 of the vecor corresponds to dimension 1 because there is no need to store individual vertices.

type Graph a = Vector (Vector (a, Bool)) Source #

This represents the (symmetric) adjacency matrix of some weighted, undirected graph. The type a is whatever distance is in your data analysis procedure. The reason graphs are represented like this is because their main purpose is too speed up the construction of simplicial complexes and filtrations. If the clique complex of this graph were to be constructed, only the adjacencies would matter. But if a filtration is constructed all the distances will be required over and over again - this allows them to be accessed in constant time.

# Utilities

sc2String :: SimplicialComplex -> String Source #

Show all the information in a simplicial complex.

getDim :: SimplicialComplex -> Int Source #

Get the dimension of the highest dimensional simplex (constant time).

encodeWeightedGraph :: Int -> (Int -> Int -> (a, Bool)) -> Graph a Source #

Takes the number of vertices and each edge paired with its weight to output the graph encoded as a 2D vector. If only you have an unweighted graph, you can still encode your graph by simply letting the type a be (). If you have a weighted graph but there isn't a distance between every vertex, you can use the Extended type (essentially the same as Maybe) from the Filtration module which is already an instance of Ord.

wGraph2sc :: Graph a -> SimplicialComplex Source #

Given a weighted graph, construct a 1-dimensional simplicial complex in the canonical way. Betti numbers of this simplicial complex can be used to count cycles and connected components.

indexGraph :: Graph a -> (Int, Int) -> (a, Bool) Source #

Safely index into the adjacency matrix of a graph.

# Construction

makeNbrhdGraph :: (Ord a, Eq b) => a -> (b -> b -> a) -> Either (Vector b) [b] -> Graph a Source #

The first argument is a scale, the second is a metric, and the third is the data. This function records the distance between every element of the data and whether or not it is smaller than the given scale.

makeNbrhdGraphPar :: (Ord a, Eq b) => a -> (b -> b -> a) -> Either (Vector b) [b] -> Graph a Source #

Parallel version of the above.

makeCliqueComplex :: Graph a -> SimplicialComplex Source #

Makes a simplicial complex where the simplices are the complete subgraphs (cliques) of the given graph. Mainly a helper function for makeRipsComplexFast, but it might be useful if you happen to have a graph you want to analyze. I highly recomend using the parallel version, as this process is very expensive.

makeCliqueComplexPar :: Graph a -> SimplicialComplex Source #

Parallel version of the above.

makeRipsComplexFast :: (Ord a, Eq b) => a -> (b -> b -> a) -> Either (Vector b) [b] -> (SimplicialComplex, Graph a) Source #

Constructs the Vietoris-Rips complex given a scale, metric, and data set. Also uses O(n^2) memory (where n is the number of data points) for a graph storing all the distances between data points.

makeRipsComplexFastPar :: (Ord a, Eq b) => a -> (b -> b -> a) -> Either (Vector b) [b] -> (SimplicialComplex, Graph a) Source #

Parallel version of the above.

makeRipsComplexLight :: (Ord a, Eq b) => a -> (b -> b -> a) -> Either (Vector b) [b] -> SimplicialComplex Source #

Constructs the Vietoris-Rips complex given a scale, metric, and data set.

makeRipsComplexLightPar :: (Ord a, Eq b) => a -> (b -> b -> a) -> Either (Vector b) [b] -> SimplicialComplex Source #

Parallel version of the above.

# Homology

bettiNumbers :: SimplicialComplex -> [Int] Source #

The nth index of the output corresponds to the nth Betti number.

The zeroth Betti number is the number of connected components, the 1st is the number of tunnels/ punctures, the 2nd is the number of hollow volumes, and so on for higher-dimensional holes.

bettiNumbersPar :: SimplicialComplex -> [Int] Source #

Calculates all of the Betti numbers in parallel.