-- Hoogle documentation, generated by Haddock
-- See Hoogle, http://www.haskell.org/hoogle/


-- | Quickly detect clusters and holes in data.
--   
--   Persistence is a topological data analysis library motivated by
--   flexibility when it comes to the type of data being analyzed. If you
--   have data that comes with a meaningful function into something of the
--   Ord typeclass, you can use Persistence to detect clusters and holes in
--   the data. You can also use the library to analyze undirected graphs,
--   and interesting features for directed graphs will be added soon.
@package Persistence
@version 1.0


-- | This module provides functions for constructing neighborhood graphs,
--   clique complexes, Vietoris-Rips complexes, as well as the computation
--   of Betti numbers.
--   
--   An important thing to know about this module is the difference between
--   "fast" and "light" functions. Fast functions encode the metric in a 2D
--   vector, so that distances don't need to be computed over and over
--   again and can instead be accessed in constant time. Unfortunately,
--   this takes O(n^2) memory so I would strongly recomend against using it
--   for larger data sets; this is why "light" functions exist.
--   
--   A neighborhood graph is a graph where an edge exists between two
--   vertices if and only if the vertices fall within a given distance of
--   each other. Graphs here are more of a helper data type for
--   constructing filtrations, which is why they have a somewhat odd
--   representation.
--   
--   The clique complex of a graph is a simplicial complex whose simplices
--   are the complete subgraphs of the given graph. The Vietoris-Rips
--   complex is the clique complex of the neighborhood graph.
--   
--   Betti numbers measure the number of n-dimensional holes in a given
--   simplicial complex. The 0th Betti number is the number of connected
--   components, or clusters; the 1st Betti number is the number of loops
--   or tunnels; then 2nd Betti number measures voids or hollow volumes;
--   and if you have high-dimensional data, you might have higher Betti
--   numbers representing the number of high-dimensional holes.
module SimplicialComplex

-- | The first component of the pair is the number of vertices. The second
--   component is a vector whose entries are vectors of simplices of the
--   same dimension. The dimension starts at 1 because there is no need to
--   store individual vertices. A simplex is represented as a pair: the
--   vector of its vertices in the original data set from which the complex
--   was constructed, and the vector of the indices of the faces in the
--   next lowest dimension. Edges are the exception - they do not store
--   reference to their faces because it would be redundant with their
--   vertices.
type SimplicialComplex = (Int, Vector (Vector (Vector Int, Vector Int)))

-- | This represents the (symmetric) adjacency matrix of some undirected
--   graph. The type <tt>a</tt> is whatever distance is in your data
--   analysis regime. The reason graphs are represented like this is
--   because their main function is too speed up the construction of
--   simplicial complexes and filtrations. If the clique complex of this
--   graph were to be constructed, only the adjacencies would matter. But
--   if a filtration is constructed all the distances will be required over
--   and over again - this allows them to be accessed in constant time.
type Graph a = Vector (Vector (a, Bool))

-- | Show all the information in a simplicial complex.
sc2String :: SimplicialComplex -> String

-- | Get the dimension of the highest dimensional simplex (constant time).
getDim :: SimplicialComplex -> Int

-- | The first argument is a scale, the second is a metric, and the third
--   is the data. This function records the distance between every element
--   of the data and whether or not it is smaller than the given scale.
makeNbrhdGraph :: (Ord a, Eq b) => a -> (b -> b -> a) -> [b] -> Graph a

-- | Makes a simplicial complex where the simplices are the complete
--   subgraphs (cliques) of the given graph. Mainly a helper function for
--   makeVRComplexFast, but it might be useful if you happen to have a
--   graph you want to analyze. This utilizes any available processors in
--   parallel because the construction is quite expensive.
makeCliqueComplex :: Graph a -> SimplicialComplex

-- | Constructs the Vietoris-Rips complex given a scale, metric, and data
--   set. Also uses O(n^2) memory (where n is the number of data points)
--   for a graph storing all the distances between data points. This
--   utilizes any available processors in parallel because the construction
--   is quite expensive.
makeVRComplexFast :: (Ord a, Eq b) => a -> (b -> b -> a) -> [b] -> (SimplicialComplex, Graph a)

-- | Constructs the Vietoris-Rips complex given a scale, metric, and data
--   set. This utilizes any available processors in parallel because the
--   construction is quite expensive.
makeVRComplexLight :: (Ord a, Eq b) => a -> (b -> b -> a) -> [b] -> SimplicialComplex

-- | The nth index of the output corresponds to the nth Betti number. The
--   zeroth Betti number is the number of connected components, the 1st is
--   the number of tunnels/punctures, the 2nd is the number of hollow
--   volumes, and so on for higher-dimensional holes.
bettiNumbers :: SimplicialComplex -> [Int]

-- | Calculates all of the Betti numbers in parallel.
bettiNumbersPar :: SimplicialComplex -> [Int]


-- | This module contains functions for constructing filtrations and
--   computing persistent homology, as well as a few utility functions for
--   working with filtrations.
--   
--   A filtration is a finite sequence of simplicial complexes where each
--   complex is a subset of the next. This means that a filtration can be
--   thought of as a single simplicial complex where each of the simplices
--   is labeled with a "filtration index" that represents the index in the
--   sequence where that simplex enters the filtration.
--   
--   One way to create a filtration, given a simplicial complex, a metric
--   for the vertices, and a list of distances, is to loop through the
--   distances from greatest to least: create a simplicial complex each
--   iteration which excludes simplices that contain pairs of vertices
--   which are further than the current distance apart. This method will
--   produce a filtration of Vietoris-Rips complexes - each filtration
--   index will correspond to a VR complex whose scale is the corresponding
--   distance.
--   
--   An essential thing to note about the way this library is set up is the
--   distinction between "fast" and "light" functions. Light functions call
--   the metric every time distance between two points is required, which
--   is a lot. Fast functions store the distances between points and access
--   them in constant time, BUT this means they use O(n^2) memory with
--   respect to the number of data points, so it's a really bad idea to use
--   this optimization on substantially large data.
--   
--   IMPORTANT NOTE: This library currently only handles filtrations where
--   all the vertices have filtration index 0. Since I haven't thought of
--   any way filtrations that don't respect this property could come up in
--   applications, I have chosen to exclude them in favor of less memory
--   usage and a slight speedup.
--   
--   Persistent homology is the main event of topological data analysis. It
--   allows one to identify clusters, holes, and voids that persist in the
--   data throughout many scales. The output of the persistence algorithm
--   is a barcode diagram, which currently have a somewhat crude
--   representation in this library. A single barcode represents the
--   filtration index where a feature appears and the index where it
--   disappears (if it does). Thus, short barcodes are typically
--   interpretted as noise and long barcodes are interpretted as actual
--   features.
module Persistence

-- | The first component is simply the number of vertices (all vertices are
--   assumed to have filtration index 0). The second component is a vector
--   with an entry for each dimension of simplices, starting at dimension 1
--   for edges. Each simplex is represented as a triple: its filtration
--   index, the indices of its vertices in the original data, and the
--   indices of its faces in the next lowest dimension. Edges do not have
--   reference to their faces, as it would be redundant with their
--   vertices. All simplices are sorted according to filtration index upon
--   construction of the filtration.
type Filtration = (Int, Vector (Vector (Int, Vector Int, Vector Int)))

-- | (i, Just j) is a feature that appears at filtration index i and
--   disappears at index j, (i, Nothing) begins at i and doesn't disappear.
type BarCode = (Int, Maybe Int)

-- | Shows all the information in a simplex.
sim2String :: (Int, Vector Int, Vector Int) -> String

-- | Shows all the information in a filtration.
filtr2String :: Filtration -> String

-- | Gets the simplicial complex specified by the filtration index. This is
--   O(n) with respect to the number of simplices.
getComplex :: Int -> Filtration -> SimplicialComplex

-- | The first argument is a list of dimensions, the second argument is a
--   list of filtration indices. The function returns the number of
--   simplices in the filtration whose dimension and index exist in the
--   respective lists.
getNumSimplices :: [Int] -> [Int] -> Filtration -> Int

-- | Return the dimension of the highest dimensional simplex in the
--   filtration (constant time).
getDimension :: Filtration -> Int

-- | Given a list of scales, a simplicial complex, and a weighted graph
--   (see SimplicialComplex) which encodes a metric on the vertices, this
--   function creates a filtration out of a simplicial complex by removing
--   simplices that contain edges that are too long for each scale in the
--   list. Really, this is a helper function to be called by
--   makeVRFiltrationFast, but I decided to expose it in case you have a
--   simplicial complex and weighted graph lying around. The scales MUST be
--   in decreasing order for this function.
filterByWeightsFast :: Ord a => [a] -> (SimplicialComplex, Graph a) -> Filtration

-- | Given a list of scales, a metric, and a data set, this function
--   constructs a filtration of the Vietoris-Rips complexes associated with
--   the scales. The scales MUST be in decreasing order. Note that this a
--   fast function, meaning it uses O(n^2) memory to quickly access
--   distances where n is the number of data points.
makeVRFiltrationFast :: (Ord a, Eq b) => [a] -> (b -> b -> a) -> [b] -> Filtration

-- | The same as filterbyWeightsFast except it uses far less memory at the
--   cost of speed. Note that the scales must be in decreasing order.
filterByWeightsLight :: Ord a => [a] -> (b -> b -> a) -> [b] -> SimplicialComplex -> Filtration

-- | Given a list of scales in decreasing order, a metric, and a data set,
--   this constructs the filtration of Vietoris-Rips complexes
--   corresponding to the scales.
makeVRFiltrationLight :: (Ord a, Eq b) => [a] -> (b -> b -> a) -> [b] -> Filtration

-- | Each index in the list is a list of barcodes whose dimension
--   corresponds to the index. That is, the first list will represent
--   clusters, the second list will represent tunnels or punctures, the
--   third will represent hollow volumes, and the nth index list will
--   represent n-dimensional holes in the data.
persistentHomology :: Filtration -> [[BarCode]]