-- Hoogle documentation, generated by Haddock
-- See Hoogle, http://www.haskell.org/hoogle/


-- | Quickly detect clusters and holes in data.
--   
--   Persistence is a topological data analysis library motivated by
--   flexibility when it comes to the type of data being analyzed. If you
--   have data that comes with a meaningful function into something of the
--   Ord typeclass, you can use Persistence to detect clusters and holes in
--   the data. You can also use the library to analyze the topology of
--   directed/undirected weighted/unweighted graphs, and compare topologies
--   of different data sets.
@package Persistence
@version 1.1


-- | This module provides functions for constructing neighborhood graphs,
--   clique complexes, Vietoris-Rips complexes, as well as the computation
--   of Betti numbers.
--   
--   An important thing to know about this module is the difference between
--   "fast" and "light" functions. Fast functions encode the metric in a 2D
--   vector, so that distances don't need to be computed over and over
--   again and can instead be accessed in constant time. Unfortunately,
--   this takes O(n^2) memory so I would strongly recomend against using it
--   for larger data sets; this is why "light" functions exist.
--   
--   A neighborhood graph is a graph where an edge exists between two
--   vertices if and only if the vertices fall within a given distance of
--   each other. Graphs here are more of a helper data type for
--   constructing filtrations, which is why they have a somewhat odd
--   representation. Not to worry though, I've supplied a way of encoding
--   graphs of a more generic representation.
--   
--   The clique complex of a graph is a simplicial complex whose simplices
--   are the complete subgraphs of the given graph. The Vietoris-Rips
--   complex is the clique complex of the neighborhood graph.
--   
--   Betti numbers measure the number of n-dimensional holes in a given
--   simplicial complex. The 0th Betti number is the number of connected
--   components, or clusters; the 1st Betti number is the number of loops
--   or tunnels; the 2nd Betti number measures voids or hollow volumes; and
--   if you have high-dimensional data, you might have higher Betti numbers
--   representing the number of high-dimensional holes.
module SimplicialComplex

-- | The first component of the pair is the number of vertices. The second
--   component is a vector whose entries are vectors of simplices of the
--   same dimension. Index 0 of the vecor corresponds to dimension 1
--   because there is no need to store individual vertices. A simplex is
--   represented as a pair: the vector of its vertices (represent by their
--   index in the original data set), and the vector of the indices of the
--   faces in the next lowest dimension. Edges are the exception to the
--   last part - they do not store reference to their faces because it
--   would be redundant with their vertices.
type SimplicialComplex = (Int, Vector (Vector (Vector Int, Vector Int)))

-- | This represents the (symmetric) adjacency matrix of some weighted
--   undirected graph. The type <tt>a</tt> is whatever distance is in your
--   data analysis regime. The reason graphs are represented like this is
--   because their main function is too speed up the construction of
--   simplicial complexes and filtrations. If the clique complex of this
--   graph were to be constructed, only the adjacencies would matter. But
--   if a filtration is constructed all the distances will be required over
--   and over again - this allows them to be accessed in constant time.
type Graph a = Vector (Vector (a, Bool))

-- | Show all the information in a simplicial complex.
sc2String :: SimplicialComplex -> String

-- | Get the dimension of the highest dimensional simplex (constant time).
getDim :: SimplicialComplex -> Int

-- | Takes the number of vertices and each edge paired with its weight to
--   output the graph encoded as a 2D vector. If only you have an
--   unweighted graph, you can still encode your graph by simply letting
--   the type <tt>a</tt> be `()`. If you have a weighted graph but there
--   isn't a distance between every vertex, you can use the
--   <tt>Extended</tt> type (essentially the same as Maybe) from the
--   Persistence module which is already an instance of Ord.
encodeWeightedGraph :: Int -> (Int -> Int -> (a, Bool)) -> Graph a

-- | The first argument is a scale, the second is a metric, and the third
--   is the data. This function records the distance between every element
--   of the data and whether or not it is smaller than the given scale.
makeNbrhdGraph :: (Ord a, Eq b) => a -> (b -> b -> a) -> [b] -> Graph a

-- | Makes a simplicial complex where the simplices are the complete
--   subgraphs (cliques) of the given graph. Mainly a helper function for
--   makeVRComplexFast, but it might be useful if you happen to have a
--   graph you want to analyze. This utilizes any available processors in
--   parallel because the construction is quite expensive.
makeCliqueComplex :: Graph a -> SimplicialComplex

-- | Constructs the Vietoris-Rips complex given a scale, metric, and data
--   set. Also uses O(n^2) memory (where n is the number of data points)
--   for a graph storing all the distances between data points. This
--   utilizes any available processors in parallel because the construction
--   is quite expensive.
makeVRComplexFast :: (Ord a, Eq b) => a -> (b -> b -> a) -> [b] -> (SimplicialComplex, Graph a)

-- | Constructs the Vietoris-Rips complex given a scale, metric, and data
--   set. This utilizes any available processors in parallel because the
--   construction is quite expensive.
makeVRComplexLight :: (Ord a, Eq b) => a -> (b -> b -> a) -> [b] -> SimplicialComplex

-- | The nth index of the output corresponds to the nth Betti number. The
--   zeroth Betti number is the number of connected components, the 1st is
--   the number of tunnels/punctures, the 2nd is the number of hollow
--   volumes, and so on for higher-dimensional holes.
bettiNumbers :: SimplicialComplex -> [Int]

-- | Calculates all of the Betti numbers in parallel.
bettiNumbersPar :: SimplicialComplex -> [Int]


-- | This module contains functions for constructing filtrations and
--   computing persistent homology, as well as a few utility functions for
--   working with filtrations.
--   
--   A filtration is a finite sequence of simplicial complexes where each
--   complex is a subset of the next. This means that a filtration can be
--   thought of as a single simplicial complex where each of the simplices
--   is labeled with a "filtration index" that represents the index in the
--   sequence where that simplex enters the filtration.
--   
--   One way to create a filtration, given a simplicial complex, a metric
--   for the vertices, and a list of distances, is to loop through the
--   distances from greatest to least: create a simplicial complex each
--   iteration which excludes simplices that contain pairs of vertices
--   which are further than the current distance apart. This method will
--   produce a filtration of Vietoris-Rips complexes - each filtration
--   index will correspond to a VR complex whose scale is the corresponding
--   distance.
--   
--   An essential thing to note about the way this library is set up is the
--   distinction between "fast" and "light" functions. Light functions call
--   the metric every time distance between two points is required, which
--   is a lot. Fast functions store the distances between points and access
--   them in constant time, BUT this means they use O(n^2) memory with
--   respect to the number of data points, so it's a really bad idea to use
--   this optimization on substantially large data.
--   
--   IMPORTANT NOTE: This library currently only handles filtrations where
--   all the vertices have filtration index 0. Since I haven't thought of
--   any way filtrations that don't respect this property could come up in
--   applications, I have chosen to exclude them in favor of less memory
--   usage and a slight speedup.
--   
--   Persistent homology is the main event of topological data analysis. It
--   allows one to identify clusters, holes, and voids that persist in the
--   data throughout many scales. The output of the persistence algorithm
--   is a barcode diagram, which currently have a somewhat crude
--   representation in this library. A single barcode represents the
--   filtration index where a feature appears and the index where it
--   disappears (if it does). Thus, short barcodes are typically
--   interpretted as noise and long barcodes are interpretted as actual
--   features.
--   
--   After you've got the persistent homology of a data set, you might want
--   to compare it with that of a different data set. That's why this
--   release includes two versions of "bottleneck distance," one works only
--   if the number of features in each data set is the same and the other
--   works regardless.
module Persistence

-- | The first component is simply the number of vertices (all vertices are
--   assumed to have filtration index 0). The second component is a vector
--   with an entry for each dimension of simplices, starting at dimension 1
--   for edges. Each simplex is represented as a triple: its filtration
--   index, the indices of its vertices in the original data, and the
--   indices of its faces in the next lowest dimension. Edges do not have
--   reference to their faces, as it would be redundant with their
--   vertices. All simplices are sorted according to filtration index upon
--   construction of the filtration.
type Filtration = (Int, Vector (Vector (Int, Vector Int, Vector Int)))

-- | (i, Just j) is a feature that appears at filtration index i and
--   disappears at index j, (i, Nothing) begins at i and doesn't disappear.
type BarCode = (Int, Maybe Int)

-- | Type for representing inifinite bottleneck distance.
data Extended a

-- | Shows all the information in a simplex.
sim2String :: (Int, Vector Int, Vector Int) -> String

-- | Shows all the information in a filtration.
filtr2String :: Filtration -> String

-- | Gets the simplicial complex specified by the filtration index. This is
--   O(n) with respect to the number of simplices.
getComplex :: Int -> Filtration -> SimplicialComplex

-- | The first argument is a list of dimensions, the second argument is a
--   list of filtration indices. The function returns the number of
--   simplices in the filtration whose dimension and index exist in the
--   respective lists.
getNumSimplices :: [Int] -> [Int] -> Filtration -> Int

-- | Return the dimension of the highest dimensional simplex in the
--   filtration (constant time).
getDimension :: Filtration -> Int

-- | Given a list of scales, a simplicial complex, and a weighted graph
--   (see SimplicialComplex) which encodes a metric on the vertices, this
--   function creates a filtration out of a simplicial complex by removing
--   simplices that contain edges that are too long for each scale in the
--   list. This is really a helper function to be called by
--   makeVRFiltrationFast, but I decided to expose it in case you have a
--   simplicial complex and weighted graph lying around. The scales MUST be
--   in decreasing order for this function.
filterByWeightsFast :: Ord a => [a] -> (SimplicialComplex, Graph a) -> Filtration

-- | Given a list of scales, a metric, and a data set, this function
--   constructs a filtration of the Vietoris-Rips complexes associated with
--   the scales. The scales MUST be in decreasing order. Note that this a
--   fast function, meaning it uses O(n^2) memory to quickly access
--   distances where n is the number of data points.
makeVRFiltrationFast :: (Ord a, Eq b) => [a] -> (b -> b -> a) -> [b] -> Filtration

-- | The same as filterbyWeightsFast except it uses far less memory at the
--   cost of speed. Note that the scales must be in decreasing order.
filterByWeightsLight :: Ord a => [a] -> (b -> b -> a) -> [b] -> SimplicialComplex -> Filtration

-- | Given a list of scales in decreasing order, a metric, and a data set,
--   this constructs the filtration of Vietoris-Rips complexes
--   corresponding to the scales.
makeVRFiltrationLight :: (Ord a, Eq b) => [a] -> (b -> b -> a) -> [b] -> Filtration

-- | Each index in the list is a list of barcodes whose dimension
--   corresponds to the index. That is, the first list will represent
--   clusters, the second list will represent tunnels or punctures, the
--   third will represent hollow volumes, and the nth index list will
--   represent n-dimensional holes in the data.
persistentHomology :: Filtration -> [[BarCode]]

-- | Return the maximum of minimum distances bewteen the bar codes. It's
--   important to note that the function isn't "unsafe" in the sense that
--   it will throw an exception, it will just give you a distance
--   regardless of whether or not there is the same number of barcodes is
--   in each list.
bottelNeckDistance :: [BarCode] -> [BarCode] -> Extended Double

-- | Get's all the bottle neck distances; a good way to determine the
--   similarity of the topology of two filtrations.
bottelNeckDistances :: [[BarCode]] -> [[BarCode]] -> [Extended Double]

-- | If the number of barcodes is the same, return the maximum of minimum
--   distances bewteen the bar codes. Otherwise return nothing.
safeBottelNeckDistance :: [BarCode] -> [BarCode] -> Maybe (Extended Double)

-- | Safely get all the bottle neck distances; a good way to determine the
--   similarity of the topology of two filtrations.
safeBottelNeckDistances :: [[BarCode]] -> [[BarCode]] -> [Maybe (Extended Double)]
instance GHC.Classes.Eq a => GHC.Classes.Eq (Persistence.Extended a)
instance (GHC.Classes.Ord a, GHC.Classes.Eq a) => GHC.Classes.Ord (Persistence.Extended a)


-- | This module implements algorithms for admissible Hasse diagrams. A
--   Hasse diagram is admissible if it is stratified and oriented. A
--   diagram is stratified if all the vertices can be arranged in rows such
--   that all the sources of each vertex are in the next highest row and
--   all the targets are in the next lowest row. A diagram is oriented if
--   every vertex has a linear ordering on its targets.
--   
--   A node in the diagram is represented as a tuple: the indices of the
--   level 0 nodes in the diagram that are reachable from this node, the
--   indices of targets in the next lowest level, and the indices of the
--   sources in the next highest level. The entire diagram is simply an
--   array where each entry is an array representing a particular level;
--   index 0 represents level 0, etc.
--   
--   Any directed graph can be encoded as an admissible Hasse diagram with
--   2 levels. The edges are level 1 and the vertices are level 0. The
--   ordering on the targets of a node representing an edge is simply the
--   terminal vertex first and the initial vertex second. This may be
--   counterintuitive, but its helpful to interpret an arrow between two
--   vertices as the "&lt;" operator. This induces a linear ordering on the
--   vertices of any acyclic complete subgraph - which is what the nodes in
--   the Hasse diagram of the directed clique complex represent.
--   
--   Any oriented simplicial complex can also be encoded as an admissible
--   Hasse diagram. A node is a simplex, the targets are the faces of the
--   simplex, and the sources are simplices of which the given simplex is a
--   face.
--   
--   The main feature of this module is an algorithm which takes the Hasse
--   diagram of a directed graph and generates the Hasse diagram of the
--   directed flag complex - the simplicial complex whose simplices are
--   acyclic complete subgraphs of the given graph. Here acyclic refers to
--   a directed graph without any sequence of arrows whose heads and tails
--   match up an which has the same start and end vertex.
--   
--   The idea is that, if your directed graph represents any kind of
--   information flow, "sub-modules" in the network that simply take input,
--   process it using its nodes, and then output it without spinning the
--   information around at all. These "sub-modules" are the directed
--   cliques/flags which I've been referring to as acyclic complete
--   subgraphs up to this point. Constructing a simplicial complex out of
--   them will allow you to both simplify the 1-dimensional topology of the
--   network and possibly detect higher-dimensional topological features.
module HasseDiagram

-- | Type representing a node in a Hasse diagram. Hasse diagrams are being
--   used to represent simplicial complexes so each node represents a
--   simplex Contents of the tuple in order: Vector of references to
--   vertices of the underlying directed graph, vector of references to the
--   simplices faes in the next lowest level of the Hasse diagram, vector
--   of references to "parent" simplices (simplices who have this simplex
--   as a face) in the next highest level of the Hasse diagram.
type Node = (Vector Int, Vector Int, Vector Int)

-- | Type representing an admissible Hasse diagram. Each entry in the
--   vector represents a level in the Hasse diagram.
type HasseDiagram = Vector (Vector Node)

-- | Simple printing function for Hasse diagrams.
hsd2String :: HasseDiagram -> String

-- | Given the number of vertices in a directed graph, and pairs
--   representing the direction of each edge (initial, terminal), construct
--   a Hasse diagram representing the graph.
encodeDirectedGraph :: Int -> [(Int, Int)] -> HasseDiagram

-- | Given a Hasse diagram representing a directed graph, construct the
--   diagram representing the directed clique/flag complex of the graph.
directedFlagComplex :: HasseDiagram -> HasseDiagram

-- | Convert a Hasse diagram to a simplicial complex.
toSimplicialComplex :: HasseDiagram -> SimplicialComplex