-- Hoogle documentation, generated by Haddock
-- See Hoogle, http://www.haskell.org/hoogle/


-- | A versatile library for topological data anlysis.
--   
--   A topological data anlysis library motivated by flexibility when it
--   comes to the type of data being analyzed. If your data comes with a
--   meaningful binary function into into an ordered set, you can use
--   Persistence to analyze your data. The library also provides functions
--   for analyzing directed/undirected, weighted/unweighted graphs. See the
--   README for resources on learning about topological data anlysis.
@package Persistence
@version 2.0


-- | This module provides functions for constructing neighborhood graphs,
--   clique complexes, and Vietoris-Rips complexes, as well as the
--   computation of Betti numbers.
--   
--   A simplicial complex is like a generalization of a graph, where you
--   can have any number of n-dimensional simplices (line-segments,
--   triangles, tetrahedrons), glued along their boundaries so that the
--   intersection of any two simplices is a simplicial complex. These
--   objects are fundamental to topological data analysis.
--   
--   An important thing to note about this module, and the library in
--   general, is the difference between "fast" and "light" functions. Fast
--   functions encode the metric in a 2D vector, so that distances don't
--   need to be computed over and over again and can instead be accessed in
--   constant time. Unfortunately, this takes O(n^2) memory so I would
--   strongly recomend against using it for larger data sets unless you
--   have a lot of RAM. <a>Light</a> functions do not use this speed
--   optimization.
--   
--   The neighborhood graph of point cloud data is a graph where an edge
--   exists between two data points if and only if the vertices fall within
--   a given distance of each other. Graphs here are more of a helper data
--   type for constructing filtrations, which is why they have a somewhat
--   odd representation. Not to worry, though, I've supplied a way of
--   encoding graphs of a more generic representation.
--   
--   The clique complex of a graph is a simplicial complex whose simplices
--   are the complete subgraphs of the given graph. The Vietoris-Rips
--   complex is the clique complex of the neighborhood graph.
--   
--   Betti numbers measure the number of n-dimensional holes in a given
--   simplicial complex. The 0th Betti number is the number of connected
--   components, or clusters; the 1st Betti number is the number of loops
--   or tunnels; the 2nd Betti number measures voids or hollow volumes; and
--   if you have high-dimensional data, you might have higher Betti numbers
--   representing the number of high-dimensional holes.
module Persistence.SimplicialComplex

-- | This is type synonym exists to make other synonyms more concise. A
--   simplex is represented as a pair: the vector of its vertices (their
--   index in the original data set), and the vector of the indices of the
--   faces in the next lowest dimension of the simplicial complex.
--   1-simplices are the exception: they do not store reference to their
--   faces because it would be redundant with their vertices.
type Simplex = (Vector Int, Vector Int)

-- | The first component of the pair is the number of vertices. The second
--   component is a vector whose entries are vectors of simplices of the
--   same dimension. Index 0 of the vecor corresponds to dimension 1
--   because there is no need to store individual vertices.
type SimplicialComplex = (Int, Vector (Vector Simplex))

-- | This represents the (symmetric) adjacency matrix of some weighted,
--   undirected graph. The type a is whatever distance is in your data
--   analysis procedure. The reason graphs are represented like this is
--   because their main purpose is too speed up the construction of
--   simplicial complexes and filtrations. If the clique complex of this
--   graph were to be constructed, only the adjacencies would matter. But
--   if a filtration is constructed all the distances will be required over
--   and over again - this allows them to be accessed in constant time.
type Graph a = Vector (Vector (a, Bool))

-- | Show all the information in a simplicial complex.
sc2String :: SimplicialComplex -> String

-- | Get the dimension of the highest dimensional simplex (constant time).
getDim :: SimplicialComplex -> Int

-- | Takes the number of vertices and each edge paired with its weight to
--   output the graph encoded as a 2D vector. If only you have an
--   unweighted graph, you can still encode your graph by simply letting
--   the type a be (). If you have a weighted graph but there isn't a
--   distance between every vertex, you can use the Extended type
--   (essentially the same as Maybe) from the Filtration module which is
--   already an instance of Ord.
encodeWeightedGraph :: Int -> (Int -> Int -> (a, Bool)) -> Graph a

-- | Given a weighted graph, construct a 1-dimensional simplicial complex
--   in the canonical way. Betti numbers of this simplicial complex can be
--   used to count cycles and connected components.
wGraph2sc :: Graph a -> SimplicialComplex

-- | The first argument is a scale, the second is a metric, and the third
--   is the data. This function records the distance between every element
--   of the data and whether or not it is smaller than the given scale.
makeNbrhdGraph :: (Ord a, Eq b) => a -> (b -> b -> a) -> Either (Vector b) [b] -> Graph a

-- | Parallel version of the above.
makeNbrhdGraphPar :: (Ord a, Eq b) => a -> (b -> b -> a) -> Either (Vector b) [b] -> Graph a

-- | Makes a simplicial complex where the simplices are the complete
--   subgraphs (cliques) of the given graph. Mainly a helper function for
--   makeRipsComplexFast, but it might be useful if you happen to have a
--   graph you want to analyze. I highly recomend using the parallel
--   version, as this process is very expensive.
makeCliqueComplex :: Graph a -> SimplicialComplex

-- | Parallel version of the above.
makeCliqueComplexPar :: Graph a -> SimplicialComplex

-- | Constructs the Vietoris-Rips complex given a scale, metric, and data
--   set. Also uses O(n^2) memory (where n is the number of data points)
--   for a graph storing all the distances between data points.
makeRipsComplexFast :: (Ord a, Eq b) => a -> (b -> b -> a) -> Either (Vector b) [b] -> (SimplicialComplex, Graph a)

-- | Parallel version of the above.
makeRipsComplexFastPar :: (Ord a, Eq b) => a -> (b -> b -> a) -> Either (Vector b) [b] -> (SimplicialComplex, Graph a)

-- | Constructs the Vietoris-Rips complex given a scale, metric, and data
--   set.
makeRipsComplexLight :: (Ord a, Eq b) => a -> (b -> b -> a) -> Either (Vector b) [b] -> SimplicialComplex

-- | Parallel version of the above.
makeRipsComplexLightPar :: (Ord a, Eq b) => a -> (b -> b -> a) -> Either (Vector b) [b] -> SimplicialComplex

-- | The nth index of the output corresponds to the nth Betti number.
--   
--   The zeroth Betti number is the number of connected components, the 1st
--   is the number of tunnels/ punctures, the 2nd is the number of hollow
--   volumes, and so on for higher-dimensional holes.
bettiNumbers :: SimplicialComplex -> [Int]

-- | Calculates all of the Betti numbers in parallel.
bettiNumbersPar :: SimplicialComplex -> [Int]


-- | This module allows one to do computations involving directed graphs.
--   Namely, it allows you to convert a directed graph (presented in a
--   generic way) into a simplicial complex and provides functions for
--   constructing the "directed clique complex," see below.
--   
--   This module uses algorithms for admissible Hasse diagrams. A Hasse
--   diagram is admissible if it is stratified and oriented. A diagram is
--   stratified if all the vertices can be arranged in rows such that all
--   the sources of each vertex are in the next highest row and all the
--   targets are in the next lowest row. A diagram is oriented if every
--   vertex has a linear ordering on its targets.
--   
--   A node in the diagram is represented as a tuple: the indices of the
--   level 0 nodes in the diagram that are reachable from this node, the
--   indices of targets in the next lowest level, and the indices of the
--   sources in the next highest level. The entire diagram is simply an
--   array of arrays representing each particular level; index 0 represents
--   level 0, etc.
--   
--   Any directed graph can be encoded as an admissible Hasse diagram with
--   2 levels. The edges are level 1 and the vertices are level 0. The
--   ordering on the targets of a node representing an edge is simply the
--   terminal vertex first and the initial vertex second. This may be
--   counterintuitive, but its helpful to interpret an arrow between two
--   vertices as the "&lt;" operator. This induces a linear ordering on the
--   vertices of any acyclic complete subgraph - which is what the nodes in
--   the Hasse diagram of the directed clique complex represent.
--   
--   Any oriented simplicial complex can also be encoded as an admissible
--   Hasse diagram. A node is a simplex, the targets are the faces of the
--   simplex, and the sources are simplices of which the given simplex is a
--   face.
--   
--   The main feature of this module is an algorithm which takes the Hasse
--   diagram of a directed graph and generates the Hasse diagram of the
--   directed flag complex - the simplicial complex whose simplices are
--   acyclic complete subgraphs of the given graph. Here acyclic refers to
--   a directed graph without any sequence of arrows whose heads and tails
--   match up and which has the same start and end vertex.
--   
--   The idea is that, if your directed graph represents any kind of
--   information flow, "sub-modules" in the network are groups of nodes
--   that simply take input, process it, and then output it without
--   spinning the information around at all. These "sub-modules" are the
--   directed cliques/flags which I've been referring to as acyclic
--   complete subgraphs up to this point. Constructing a simplicial complex
--   out of them will allow you to both simplify the 1-dimensional topology
--   of the network and possibly detect higher-dimensional topological
--   features.
--   
--   The algorithm for constructing the directed clique complex comes from
--   this paper by Markram et al:
--   <a>https://www.frontiersin.org/articles/10.3389/fncom.2017.00048/full</a>.
module Persistence.HasseDiagram

-- | Type representing a node in a Hasse diagram. Hasse diagrams are being
--   used to represent simplicial complexes so each node represents a
--   simplex. Contents of the tuple in order: Vector of references to
--   vertices of the underlying directed graph, vector of references to the
--   simplices faes in the next lowest level of the Hasse diagram, vector
--   of references to "parent" simplices (simplices who have this simplex
--   as a face) in the next highest level of the Hasse diagram.
type Node = (Vector Int, Vector Int, Vector Int)

-- | Type representing an admissible Hasse diagram. Each entry in the
--   vector represents a level in the Hasse diagram.
type HasseDiagram = Vector (Vector Node)

-- | Simple printing function for Hasse diagrams.
hsd2String :: HasseDiagram -> String

-- | Given the number of vertices in a directed graph, and pairs
--   representing the direction of each edge, construct a 1-dimensional
--   simplicial complex in the canonical way. Betti numbers of this
--   simplicial complex can be used to count cycles and connected
--   components.
dGraph2sc :: Int -> [(Int, Int)] -> SimplicialComplex

-- | Given the number of vertices in a directed graph, and pairs
--   representing the direction of each edge (initial, terminal), construct
--   a Hasse diagram representing the graph.
encodeDirectedGraph :: Int -> [(Int, Int)] -> HasseDiagram

-- | Given a Hasse diagram representing a directed graph, construct the
--   diagram representing the directed clique/flag complex of the graph.
--   Algorithm adapted from the one shown in the supplementary materials of
--   this paper:
--   <a>https://www.frontiersin.org/articles/10.3389/fncom.2017.00048/full</a>
directedFlagComplex :: HasseDiagram -> HasseDiagram

-- | Convert a Hasse diagram to a simplicial complex.
hDiagram2sc :: HasseDiagram -> SimplicialComplex


-- | This module contains functions for constructing filtrations and
--   computing persistent homology, persistence landscapes, and computing
--   bottleneck distance between barcode diagrams.
--   
--   A filtration is a finite sequence of simplicial complexes where each
--   complex is a subset of the next. This means that a filtration can be
--   thought of as a single simplicial complex where each of the simplices
--   is labeled with a "filtration index" that represents the index in the
--   sequence where that simplex enters the filtration.
--   
--   One way to create a filtration, given a simplicial complex, a metric
--   for the vertices, and a list of distances, is to loop through the
--   distances from greatest to least: create a simplicial complex each
--   iteration which excludes simplices that contain pairs of vertices
--   which are further than the current distance apart. This method will
--   produce a filtration of Vietoris-Rips complexes - each filtration
--   index will correspond to a Rips complex whose scale is the
--   corresponding distance. This filtration represents the topology of the
--   data at each of the scales with which it was constructed.
--   
--   NOTE: It's important that, even though the smallest filtration index
--   represents the smallest scale at which the data is being anaylzed, all
--   functions in this library receive your list of scales sorted in
--   *decreasing* order.
--   
--   An essential thing to note in this library is the distinction between
--   "fast" and "light" functions. Light functions call the metric every
--   time distance between two points is required, which is a lot. Fast
--   functions store the distances between points and access them in
--   constant time, BUT this means they use O(n^2) memory with respect to
--   the number of data points, so it's a really bad idea to use this
--   optimization on substantially large data if you don't have a lot of
--   RAM.
--   
--   Persistent homology is the main event of topological data analysis. It
--   allows one to identify clusters, tunnels, cavities, and higher
--   dimensional holes that persist in the data throughout many scales. The
--   output of the persistence algorithm is a barcode diagram. A single
--   barcode represents the filtration index where a feature appears and
--   the index where it disappears (if it does). Alternatively, a barcode
--   can represent the scale at which a feature and the scale at which it
--   ends. Thus, short barcodes are typically interpretted as sampling
--   irregularities and long barcodes are interpretted as actual features
--   of whatever the underlying data set represents. In this context, what
--   a feature *is* depends on which dimension the barcode diagram is;
--   0-dimensional features are connected components, 1-dimensional
--   features are loops or tunnels, 2-dimensional features are hollow
--   volumes, and higher dimensional features correspond to
--   heigher-dimensional cavities.
--   
--   After you've got the barcodes of a data set, you might want to compare
--   it with that of a different data set. This is the purpose of
--   bottleneck distance, which corresponds to the Hausdorff distance
--   between barcode diagrams.
--   
--   Another way to compare barcode diagrams is by using persistence
--   landscapes. The peristence landscape of a barcode diagram is a finite
--   sequence of piecewise-linear, real-valued functions. This means they
--   can be used to take averages and compute distances between barcode
--   diagrams. See "A Persistence Landscapes Toolbox For Topological
--   Statistics" by Bubenik and Dlotko for more information.
--   
--   WARNING: The persistence landscape functions have not been fully
--   tested. Use them with caution. If you get any errors or unexpected
--   output, please don't hesitate to email me.
module Persistence.Filtration

-- | This type synonym exists to make other synonyms more concise. Each
--   simplex in a filtration is represented as a triple: its filtration
--   index, the indices of its vertices in the original data, and the
--   indices of its faces in the next lowest dimension. Edges do not have
--   reference to their faces, as it would be redundant with their
--   vertices. All simplices are sorted according to filtration index upon
--   construction of the filtration. In each dimension, all simplices are
--   sorted in increasing order of filtration index, and every simplices
--   face indices are sorted in decreasing order; both of these facts are
--   critical to the computation of persistent homology.
type FilterSimplex = (Int, Vector Int, Vector Int)

-- | A type representing a filtration whose vertices all have filtration
--   index 0. Slightly faster and slightly less memory usage. The first
--   component is simply the number of vertices. The second component is a
--   vector with an entry for each dimension of simplices, starting at
--   dimension 1 for edges.
type SimpleFiltration = (Int, Vector (Vector FilterSimplex))

-- | Representation of a filtration which, unlike SimpleFiltration, can
--   cope with vertices that have a non-zero filtration index. Vertices of
--   the filtration are represented like all other simplices except that
--   they don't their own have vertices or faces.
--   
--   Note that, since this library currently only deals with static
--   pointcloud data, all of the filtration construction functions produce
--   vertices whose filtration index is 0. Thus, if you want to use this
--   type you will have to construct the instances yourself.
type Filtration = Vector (Vector FilterSimplex)

-- | A type extending the number line to positive and negative infinity.
--   Used for representing infinite barcodes, bottleneck distance, and
--   persistence landscapes.
data Extended a
Finite :: a -> Extended a
Infinity :: Extended a
MinusInfty :: Extended a

-- | (x, Finite y) is a topological feature that appears at the index or
--   scale x and disappears at the index or scale y. (x, Infinity) begins
--   at x and doesn't disappear.
type BarCode a = (a, Extended a)

-- | A Persistence landscape is a certain type of piecewise linear function
--   based on a barcode diagram. It can be represented as a list of
--   critical points paired with critical values. Useful for taking
--   averages and differences between barcode diagrams.
type Landscape = Vector (Vector (Extended Double, Extended Double))

-- | Shows all the information in a simplex.
sim2String :: FilterSimplex -> String

-- | Shows all the information in a filtration.
filtr2String :: Either SimpleFiltration Filtration -> String

-- | Gets the simplicial complex specified by the filtration index. This is
--   O(n) with respect to the number of simplices.
getComplex :: Int -> Either SimpleFiltration Filtration -> SimplicialComplex

-- | Return the dimension of the highest dimensional simplex in the
--   filtration (constant time).
getDimension :: Either SimpleFiltration Filtration -> Int

-- | Convert a simple filtration into an ordinary filtration.
simple2Filtr :: SimpleFiltration -> Filtration

-- | This function creates a filtration out of a simplicial complex by
--   removing simplices that contain edges that are too long for each scale
--   in the list. This is really a helper function to be called by
--   makeRipsFiltrationFast, but I decided to expose it in case you have a
--   simplicial complex and weighted graph lying around. The scales MUST be
--   in decreasing order.
filterByWeightsFast :: Ord a => Either (Vector a) [a] -> (SimplicialComplex, Graph a) -> SimpleFiltration

-- | This function constructs a filtration of the Vietoris-Rips complexes
--   associated with the scales. Note that this a fast function, meaning it
--   uses O(n^2) memory to quickly access distances where n is the number
--   of data points.
makeRipsFiltrationFast :: (Ord a, Eq b) => Either (Vector a) [a] -> (b -> b -> a) -> Either (Vector b) [b] -> SimpleFiltration

-- | Same as above except it uses parallelism when computing the
--   Vietoris-Rips complex of the largest scale.
makeRipsFiltrationFastPar :: (Ord a, Eq b) => Either (Vector a) [a] -> (b -> b -> a) -> Either (Vector b) [b] -> SimpleFiltration

-- | The same as filterbyWeightsFast except it uses far less memory at the
--   cost of speed. Note that the scales must be in decreasing order.
filterByWeightsLight :: Ord a => Either (Vector a) [a] -> (b -> b -> a) -> Either (Vector b) [b] -> SimplicialComplex -> SimpleFiltration

-- | Constructs the filtration of Vietoris-Rips complexes corresponding to
--   each of the scales.
makeRipsFiltrationLight :: (Ord a, Eq b) => Either (Vector a) [a] -> (b -> b -> a) -> Either (Vector b) [b] -> SimpleFiltration

-- | Same as above except it uses parallelism when computing the
--   Vietoris-Rips complex of the largest scale.
makeRipsFiltrationLightPar :: (Ord a, Eq b) => Either (Vector a) [a] -> (b -> b -> a) -> Either (Vector b) [b] -> SimpleFiltration

-- | The nth entry in the list will describe the n-dimensional topology of
--   the filtration. That is, the first list will represent clusters, the
--   second list will represent tunnels or punctures, the third will
--   represent hollow volumes, and the nth index list will represent
--   n-dimensional holes in the data. Features are encoded by the
--   filtration indices where they appear and disappear.
indexBarCodes :: Filtration -> Vector (Vector (BarCode Int))

-- | Same as above except this function acts on filtrations whose vertices
--   all have filtration index zero (for a very slight speedup).
indexBarCodesSimple :: SimpleFiltration -> Vector (Vector (BarCode Int))

-- | The nth entry in the list will describe the n-dimensional topology of
--   the filtration. However, features are encoded by the scales where they
--   appear and disappear. For consistency, scales must be in decreasing
--   order.
scaleBarCodes :: Either (Vector a) [a] -> Filtration -> Vector (Vector (BarCode a))

-- | Same as above except acts only on filtrations whose vertices all have
--   filtration index 0. Note that scales must be in decreasing order.
scaleBarCodesSimple :: Either (Vector a) [a] -> SimpleFiltration -> Vector (Vector (BarCode a))

-- | The standard (Euclidean) metric between index barcodes. The distance
--   between infinite and finite barcodes is infinite, and the distance
--   between two infinite barcodes is the absolute value of the difference
--   of their fst component.
indexMetric :: BarCode Int -> BarCode Int -> Extended Double

-- | Given a metric, return the Hausdorff distance (referred to as
--   bottleneck distance in TDA) between the two sets. Returns nothing if
--   either list of barcodes is empty.
bottleNeckDistance :: Ord b => (BarCode a -> BarCode a -> Extended b) -> Vector (BarCode a) -> Vector (BarCode a) -> Maybe (Extended b)

-- | Get's all the bottleneck distances; a good way to determine the
--   similarity of the topology of two filtrations.
bottleNeckDistances :: Ord b => (BarCode a -> BarCode a -> Extended b) -> Vector (Vector (BarCode a)) -> Vector (Vector (BarCode a)) -> Vector (Maybe (Extended b))

-- | Compute the persistence landscape of the barcodes for a single
--   dimension.
calcLandscape :: Vector (BarCode Int) -> Landscape

-- | Evaluate the nth function in the landscape for the given point.
evalLandscape :: Landscape -> Int -> Extended Double -> Extended Double

-- | Evaluate all the real-valued functions in the landscape.
evalLandscapeAll :: Landscape -> Extended Double -> Vector (Extended Double)

-- | Compute a linear combination of the landscapes. If the coefficient
--   list is too short, the rest of the coefficients are assumed to be
--   zero. If it is too long, the extra coefficients are discarded.
linearComboLandscapes :: [Double] -> [Landscape] -> Landscape

-- | Average the persistence landscapes.
avgLandscapes :: [Landscape] -> Landscape

-- | Subtract the second landscape from the first.
diffLandscapes :: Landscape -> Landscape -> Landscape

-- | If p&gt;=1 then it will compute the L^p norm on the given interval.
--   Uses trapezoidal approximation. You should ensure that the stepsize
--   partitions the interval evenly.
normLp :: Extended Double -> (Double, Double) -> Double -> Landscape -> Maybe Double

-- | Given the same information as above, computes the L^p distance between
--   the two landscapes. One way to compare the topologies of two
--   filtrations.
metricLp :: Extended Double -> (Double, Double) -> Double -> Landscape -> Landscape -> Maybe Double
instance GHC.Show.Show a => GHC.Show.Show (Persistence.Filtration.Extended a)
instance GHC.Classes.Eq a => GHC.Classes.Eq (Persistence.Filtration.Extended a)
instance (GHC.Classes.Ord a, GHC.Classes.Eq a) => GHC.Classes.Ord (Persistence.Filtration.Extended a)
instance GHC.Num.Num a => GHC.Num.Num (Persistence.Filtration.Extended a)