Safe Haskell | None |
---|---|
Language | Haskell2010 |

## Synopsis

- newtype B = B {}
- newtype B1 = B1 {}
- newtype B2 = B2 {}
- type AdjacencyMatrix = SpMatrix Double
- type LabelVector = SpVector Double
- spectral :: Int -> Int -> B -> [SpVector Double]
- spectralCluster :: B -> LabelVector
- spectralClusterK :: Int -> Int -> B -> LabelVector
- spectralNorm :: Int -> Int -> AdjacencyMatrix -> [SpVector Double]
- spectralClusterNorm :: AdjacencyMatrix -> LabelVector
- spectralClusterKNorm :: Int -> Int -> AdjacencyMatrix -> LabelVector
- getB :: Bool -> SpMatrix Double -> B
- b1ToB2 :: B1 -> B2
- getSimilarityFromB2 :: B2 -> Int -> Int -> Double

# Documentation

B: the rows of B2, normed. For a complete explanation, see Shu et al., "Efficient Spectral Neighborhood Blocking for Entity Resolution", 2011.

B1: observation-by-feature matrix.

B2: term frequency-inverse document frequency (tf-idf) matrix of B1.
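The relationship between B1, B2 and B can be illustrated with a small dense sketch. This is not the library's implementation: it uses plain lists instead of `SpMatrix`, and the names `tfIdf` and `normRows` are hypothetical. It also assumes one common tf-idf weighting (raw value times `log (n / document frequency)`) and Euclidean row norms, neither of which is confirmed by the documentation above.

```haskell
module Main where

import Data.List (transpose)

-- Dense stand-in for the sparse matrices used by the library (assumption).
type Matrix = [[Double]]

-- Hypothetical tf-idf weighting: each entry is multiplied by
-- log (n / df), where n is the number of rows (observations) and df is
-- the number of rows in which the column (feature) is non-zero.
tfIdf :: Matrix -> Matrix
tfIdf b1 = [zipWith (*) row idf | row <- b1]
  where
    n :: Double
    n = fromIntegral (length b1)

    df :: [Double] -> Double
    df col = fromIntegral (length (filter (> 0) col))

    idf :: [Double]
    idf = [if df c > 0 then log (n / df c) else 0 | c <- transpose b1]

-- Scale each row to unit Euclidean length (assumed norm).
-- All-zero rows are returned unchanged to avoid dividing by zero.
normRows :: Matrix -> Matrix
normRows = map normRow
  where
    normRow row
      | len > 0   = map (/ len) row
      | otherwise = row
      where
        len = sqrt (sum (map (\x -> x * x) row))

-- Three observations (rows) over three features (columns).
exampleB1 :: Matrix
exampleB1 =
  [ [1, 1, 0]
  , [1, 0, 1]
  , [0, 0, 2]
  ]

main :: IO ()
main = mapM_ print (normRows (tfIdf exampleB1))
```

Under this weighting, a feature that occurs in every observation gets weight `log 1 = 0`, so it contributes nothing to the normed rows.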

type AdjacencyMatrix = SpMatrix Double

Adjacency matrix input.

type LabelVector = SpVector Double

Output vector containing cluster assignment (0 or 1).

spectral :: Int -> Int -> B -> [SpVector Double]

Returns a list of left singular vectors of a sparse spectral process, starting from the second left singular vector (or from N) and taking E vectors onward. Assumes the columns are features and the rows are observations. B is the normalized matrix (from getB). See Shu et al., "Efficient Spectral Neighborhood Blocking for Entity Resolution", 2011.

spectralCluster :: B -> LabelVector

Returns a vector of cluster labels for two groups by finding the second left singular vector of a special normalized matrix. Assumes the columns are features and rows are observations. B is the normalized matrix (from getB). See Shu et al., "Efficient Spectral Neighborhood Blocking for Entity Resolution", 2011.

spectralClusterK :: Int -> Int -> B -> LabelVector

Returns a vector of cluster labels for two groups by finding the second left singular vector, and onward, of a special normalized matrix and running k-means on them. Assumes the columns are features and the rows are observations. B is the normalized matrix (from getB). See Shu et al., "Efficient Spectral Neighborhood Blocking for Entity Resolution", 2011.

spectralNorm :: Int -> Int -> AdjacencyMatrix -> [SpVector Double]

Returns a list of eigenvectors of the symmetric normalized Laplacian L, starting from the eigenvector with the second smallest eigenvalue (or from N) and taking E eigenvectors onward. Computes the real symmetric part of L, so ensure the input is real and symmetric. The diagonal of the adjacency matrix should be all 0s. Uses I + Lnorm instead of I - Lnorm, so that it looks for the second largest singular value rather than the second smallest of Lnorm.

spectralClusterNorm :: AdjacencyMatrix -> LabelVector

Finds the eigenvector with the second smallest eigenvalue of the symmetric normalized Laplacian L and clusters it by sign, returning the resulting labels. Computes the real symmetric part of L, so ensure the input is real and symmetric. The diagonal of the adjacency matrix should be all 0s.
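The idea of clustering by the sign of this eigenvector can be sketched on a small dense graph. The code below is an illustration only, not the library's method: it uses plain lists, the names `shifted`, `fiedlerNorm` and `signLabels` are hypothetical, and it swaps in a simple technique (power iteration with deflation) to obtain the eigenvector. It iterates on I + D^(-1/2) A D^(-1/2), which equals 2I - Lnorm for Lnorm = I - D^(-1/2) A D^(-1/2), so its second largest eigenvalue shares an eigenvector with the second smallest eigenvalue of Lnorm. It assumes a connected graph with no zero-degree nodes, and that positive entries map to label 1.

```haskell
module Main where

import Data.List (foldl')

type Matrix = [[Double]]
type Vector = [Double]

dot :: Vector -> Vector -> Double
dot a b = sum (zipWith (*) a b)

-- Dense matrix-vector product.
mulMV :: Matrix -> Vector -> Vector
mulMV m v = [dot row v | row <- m]

-- Scale a non-zero vector to unit Euclidean length.
normalize :: Vector -> Vector
normalize v = map (/ sqrt (dot v v)) v

-- I + D^(-1/2) A D^(-1/2) for a symmetric adjacency matrix A
-- with zero diagonal and no zero-degree nodes (assumption).
shifted :: Matrix -> Matrix
shifted a =
  [ [ (if i == j then 1 else 0) + x / sqrt (di * dj)
    | (j, x, dj) <- zip3 [0 :: Int ..] row degs ]
  | (i, row, di) <- zip3 [0 :: Int ..] a degs ]
  where
    degs = map sum a

-- Remove from v its component along the unit vector u.
deflate :: Vector -> Vector -> Vector
deflate u v = zipWith (\ui vi -> vi - c * ui) u v
  where
    c = dot u v

-- Power iteration on the shifted matrix, deflating the known leading
-- eigenvector D^(1/2) 1 at every step, so the iterate approaches the
-- eigenvector of the second largest eigenvalue.
fiedlerNorm :: Int -> Matrix -> Vector
fiedlerNorm iters a = foldl' step start [1 .. iters]
  where
    m = shifted a
    top = normalize (map (sqrt . sum) a)
    start = normalize (deflate top (map fromIntegral [1 .. length a]))
    step v _ = normalize (deflate top (mulMV m v))

-- Assumed convention: positive entries get label 1, the rest 0.
signLabels :: Vector -> [Int]
signLabels = map (\x -> if x > 0 then 1 else 0)

-- Two triangles (nodes 0-2 and 3-5) joined by the single edge 2-3.
twoTriangles :: Matrix
twoTriangles =
  [ [0, 1, 1, 0, 0, 0]
  , [1, 0, 1, 0, 0, 0]
  , [1, 1, 0, 1, 0, 0]
  , [0, 0, 1, 0, 1, 1]
  , [0, 0, 0, 1, 0, 1]
  , [0, 0, 0, 1, 1, 0]
  ]

main :: IO ()
main = print (signLabels (fiedlerNorm 200 twoTriangles))
```

On this graph the two triangles receive different labels. Which triangle gets label 1 depends on the sign of the converged vector, which is arbitrary for an eigenvector.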

spectralClusterKNorm :: Int -> Int -> AdjacencyMatrix -> LabelVector

Finds the eigenvector with the second smallest eigenvalue, and onward, of the symmetric normalized Laplacian L and clusters these eigenvectors into k groups using k-means. Computes the real symmetric part of L, so ensure the input is real and symmetric. The diagonal of the adjacency matrix should be all 0s.
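The k-means step can be sketched as a plain Lloyd iteration over the rows of the spectral embedding, where each row holds one observation's entries across the chosen eigenvectors. This is a minimal illustration under stated assumptions, not the k-means implementation the library uses: the names `kmeans`, `step` and `nearest` are hypothetical, the initial centroids are simply the first k points, and the iteration count is fixed rather than tested for convergence.

```haskell
module Main where

import Data.List (minimumBy, transpose)
import Data.Ord (comparing)

-- One observation's coordinates in the spectral embedding.
type Point = [Double]

sqDist :: Point -> Point -> Double
sqDist a b = sum (zipWith (\x y -> (x - y) * (x - y)) a b)

-- Index of the centroid closest to the point.
nearest :: [Point] -> Point -> Int
nearest cs p = fst (minimumBy (comparing (sqDist p . snd)) (zip [0 ..] cs))

-- Coordinate-wise mean of a non-empty list of points.
mean :: [Point] -> Point
mean ps = map (/ fromIntegral (length ps)) (map sum (transpose ps))

-- One Lloyd iteration: assign every point to its nearest centroid,
-- then move each centroid to the mean of its members.
-- A centroid with no members stays where it is.
step :: [Point] -> [Point] -> [Point]
step ps cs =
  [ if null members then c else mean members
  | (i, c) <- zip [0 ..] cs
  , let members = [p | p <- ps, nearest cs p == i]
  ]

-- Run a fixed number of iterations from the first k points (assumed
-- initialisation) and return one cluster index per input point.
kmeans :: Int -> Int -> [Point] -> [Int]
kmeans k iters ps = map (nearest final) ps
  where
    final = iterate (step ps) (take k ps) !! iters

-- A made-up one-dimensional embedding with two obvious groups.
embedding :: [Point]
embedding = [[-0.5], [-0.3], [-0.45], [0.5], [0.4], [0.45]]

main :: IO ()
main = print (kmeans 2 20 embedding)
```

Starting from the first k points keeps the sketch deterministic. A practical implementation would normally use a better initialisation and stop once the assignments no longer change.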