Safe Haskell | None |
---|---|
Language | Haskell2010 |

## Synopsis

- newtype B = B {}
- newtype B1 = B1 {}
- newtype B2 = B2 {}
- type LabelVector = SparseMatrixXd
- spectral :: Int -> Int -> B -> SparseMatrixXd
- spectralCluster :: B -> LabelVector
- spectralClusterK :: Int -> Int -> B -> LabelVector
- getB :: Bool -> SparseMatrixXd -> B
- b1ToB2 :: B1 -> B2
- getSimilarityFromB2 :: B2 -> Int -> Int -> Double

# Documentation

B: the matrix B2 with each row normed. For a complete explanation, see Shu et al., "Efficient Spectral Neighborhood Blocking for Entity Resolution", 2011.

B1: observation-by-feature matrix.

B2: term frequency-inverse document frequency (TF-IDF) matrix of B1.
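The relationship between the three matrices above (B1 to B2 to B) can be sketched with plain lists. This is only an illustration under stated assumptions, not the library's implementation: it assumes the common weighting tf × log(n / df) and the Euclidean row norm, whereas the library follows Shu et al. and may weight differently. The names `Matrix`, `tfIdf`, and `normRows` are hypothetical and do not appear in this module.

```haskell
module Main where

import Data.List (transpose)

-- Hypothetical dense stand-in for the sparse matrices: rows are
-- observations, columns are features.
type Matrix = [[Double]]

-- B1 -> B2 (assumed weighting): scale each entry by log (n / df), where n is
-- the number of observations and df is the number of observations in which
-- the feature is nonzero. Features that never occur get weight 0.
tfIdf :: Matrix -> Matrix
tfIdf b1 = map (zipWith (*) idf) b1
  where
    n = fromIntegral (length b1)
    df = map (fromIntegral . length . filter (/= 0)) (transpose b1)
    idf = map (\d -> if d == 0 then 0 else log (n / d)) df

-- B2 -> B (assumed Euclidean norm): divide each row by its length, leaving
-- all-zero rows untouched.
normRows :: Matrix -> Matrix
normRows = map scale
  where
    scale row =
      let len = sqrt (sum (map (^ 2) row))
       in if len == 0 then row else map (/ len) row

main :: IO ()
main = do
  let b1 = [[1, 0, 2], [0, 3, 1], [4, 0, 0]]
  mapM_ print (normRows (tfIdf b1))
```

Under these assumptions every nonzero row of the result has unit length, so the dot product of two rows is their cosine similarity.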

type LabelVector = SparseMatrixXd

Output vector containing cluster assignment (0 or 1).

spectral :: Int -> Int -> B -> SparseMatrixXd

Returns the second (or Nth) left singular vector of a sparse spectral process. Assumes the columns are features and rows are observations. B is the normalized matrix (from getB). See Shu et al., "Efficient Spectral Neighborhood Blocking for Entity Resolution", 2011.
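To make "second left singular vector" concrete, here is a minimal dense sketch that uses power iteration on B·Bᵀ with deflation. It says nothing about how the library itself computes the vector; the technique, the fixed iteration count, and every name in the block (`powerIter`, `leftSingularVectors`, `exampleB`) are assumptions made for illustration.

```haskell
module Main where

import Data.List (transpose)

type Vector = [Double]
type Matrix = [[Double]]

dot :: Vector -> Vector -> Double
dot xs ys = sum (zipWith (*) xs ys)

-- Scale to unit length. A zero vector would produce NaN; the sketch does
-- not guard against that case.
normalize :: Vector -> Vector
normalize v = map (/ sqrt (dot v v)) v

-- Multiply a matrix by a column vector.
mulMV :: Matrix -> Vector -> Vector
mulMV m v = map (`dot` v) m

-- Remove from v its components along each unit-length vector in us.
deflate :: [Vector] -> Vector -> Vector
deflate us v = foldl (\acc u -> zipWith (-) acc (map (* dot acc u) u)) v us

-- Power iteration on B * B^T, kept orthogonal to the vectors in us.
-- The 500 iterations are an arbitrary choice, not a convergence guarantee.
powerIter :: Matrix -> [Vector] -> Vector
powerIter b us = iterate step start !! 500
  where
    start = normalize (deflate us [fromIntegral i | i <- [1 .. length b]])
    step v = normalize (deflate us (mulMV b (mulMV (transpose b) v)))

-- First n left singular vectors, largest singular value first.
leftSingularVectors :: Int -> Matrix -> [Vector]
leftSingularVectors n b = reverse (go n [])
  where
    go k acc
      | k <= 0 = acc
      | otherwise = go (k - 1) (powerIter b acc : acc)

-- Hypothetical data: observations 0 and 1 share features, as do 2 and 3.
exampleB :: Matrix
exampleB =
  [ [1.0, 0.9, 0.1, 0.0]
  , [0.9, 1.0, 0.0, 0.1]
  , [0.1, 0.0, 1.0, 0.9]
  , [0.0, 0.1, 0.9, 1.0]
  ]

main :: IO ()
main = print (leftSingularVectors 2 exampleB !! 1)
```

For this example the entries of the second vector have one sign for observations 0 and 1 and the opposite sign for observations 2 and 3, which is the structure the clustering functions exploit.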

spectralCluster :: B -> LabelVector

Returns a vector of cluster labels for two groups by finding the second left singular vector of a special normalized matrix. Assumes the columns are features and rows are observations. B is the normalized matrix (from getB). See Shu et al., "Efficient Spectral Neighborhood Blocking for Entity Resolution", 2011.
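One plausible way to turn a singular vector into two-group labels is to split its entries by sign. The sketch below assumes that rule; the documentation above does not state the exact threshold, and `labelsBySign` is a hypothetical helper rather than part of this module.

```haskell
module Main where

-- Hypothetical helper: label 1 for strictly positive entries, 0 otherwise.
-- The sign-based split is an assumption, not the documented rule.
labelsBySign :: [Double] -> [Int]
labelsBySign = map (\x -> if x > 0 then 1 else 0)

main :: IO ()
main = print (labelsBySign [-0.5, -0.5, 0.5, 0.5]) -- prints [0,0,1,1]
```

A singular vector is only determined up to sign, so under such a rule which group receives label 0 and which receives label 1 is arbitrary.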

spectralClusterK :: Int -> Int -> B -> LabelVector

Returns a vector of cluster labels for two groups by finding the largest singular vectors of a special normalized matrix and running k-means on them. Assumes the columns are features and rows are observations. B is the normalized matrix (from getB). See Shu et al., "Efficient Spectral Neighborhood Blocking for Entity Resolution", 2011.
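The k-means step can be pictured as clustering the observations by their coordinates in the chosen singular vectors. Below is a minimal sketch of Lloyd's algorithm on plain lists. It makes no claim about the k-means implementation the library uses or about the meaning of the two Int arguments; the seeding with the first k points, the fixed 100 iterations, and all names are assumptions.

```haskell
module Main where

import Data.List (minimumBy, transpose)
import Data.Ord (comparing)

-- One point per observation: its coordinates in the chosen singular vectors.
type Point = [Double]

sqDist :: Point -> Point -> Double
sqDist p q = sum (map (^ 2) (zipWith (-) p q))

-- Index of the centroid closest to a point.
nearest :: [Point] -> Point -> Int
nearest cs p = fst (minimumBy (comparing (sqDist p . snd)) (zip [0 ..] cs))

-- Mean of a non-empty list of points.
mean :: [Point] -> Point
mean ps = map (/ fromIntegral (length ps)) (map sum (transpose ps))

-- One Lloyd step: assign every point to its nearest centroid, then move each
-- centroid to the mean of its members. A centroid without members is kept.
step :: [Point] -> [Point] -> [Point]
step ps cs = zipWith update [0 ..] cs
  where
    labels = map (nearest cs) ps
    update i c = case [p | (p, l) <- zip ps labels, l == i] of
      [] -> c
      members -> mean members

-- Lloyd's algorithm seeded with the first k points and run for a fixed 100
-- steps (both assumptions; real implementations usually seed randomly and
-- stop on convergence).
kmeans :: Int -> [Point] -> [Int]
kmeans k ps = map (nearest final) ps
  where
    final = iterate (step ps) (take k ps) !! 100

main :: IO ()
main = print (kmeans 2 [[-0.5, 0.1], [0.5, -0.1], [-0.4, 0.2], [0.6, 0.0]])
```

In this toy input the first and third points lie close together, as do the second and fourth, so they end up sharing labels.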