gsc-weighting-0.1.0.2: Generic implementation of Gerstein/Sonnhammer/Chothia weighting.

Data.Weighting.GSC

Documentation

gsc :: Fractional d => Dendrogram d a -> Dendrogram d (a, d)

O(n^2). Calculates the Gerstein/Sonnhammer/Chothia weights for all elements of a dendrogram. Weights are attached to the leaves of the dendrogram, while the distances in branches are left unchanged.

Distances `d` in branches should be non-increasing from the root towards the leaves and lie between `0` (at the leaves) and `1`. The final weights are normalized to average to `1` (i.e. they sum to the number of sequences, the same total they would have if all weights were `1`).
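The normalization step can be illustrated on a plain list of raw weights. This is only a sketch: `normalizeToMeanOne` is a hypothetical helper, not part of this module, and it assumes the raw weights have a positive sum.

```haskell
-- Hypothetical helper, not exported by Data.Weighting.GSC.
-- Rescales raw weights so that their mean is 1, i.e. their sum equals
-- the number of weights. Assumes the raw weights have a positive sum.
normalizeToMeanOne :: Fractional d => [d] -> [d]
normalizeToMeanOne ws = map (* scale) ws
  where
    n     = fromIntegral (length ws)
    scale = n / sum ws
```

For instance, `normalizeToMeanOne [1, 2, 3]` gives `[0.5, 1.0, 1.5]`, which sums to 3, the number of weights.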

For example, suppose we have

```
dendro = Branch 0.8
           (Branch 0.5
             (Branch 0.2
               (Leaf 'A')
               (Leaf 'B'))
             (Leaf 'C'))
           (Leaf 'D')
```

This is the same as the example in the GSC paper; however, they used similarities while we use distances (i.e. applying `(1-)` to the distances gives exactly their example). Then `gsc dendro` is

```
gsc dendro == Branch 0.8
                (Branch 0.5
                  (Branch 0.2
                    (Leaf ('A',0.7608695652173914))
                    (Leaf ('B',0.7608695652173914)))
                  (Leaf ('C',1.0869565217391306)))
                (Leaf ('D',1.3913043478260871))
```

which is exactly what they calculated.
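The numbers in the example can be reproduced with a small standalone sketch of the weighting scheme. It is not this package's implementation. The `Dendrogram` type below is a local stand-in, and `gscSketch` and its helpers are hypothetical names. It also requires an `Eq d` constraint that `gsc` does not, and its handling of an all-zero subtree (splitting the edge evenly) is an assumption. The idea is that each edge length (parent height minus child height) is shared among the leaves below it in proportion to their current weights, and the result is then rescaled to average `1`.

```haskell
-- Local stand-in for the dendrogram type (an assumption of this sketch).
data Dendrogram d a
  = Leaf a
  | Branch d (Dendrogram d a) (Dendrogram d a)
  deriving (Eq, Show)

-- Height of a node; leaves sit at distance 0.
height :: Num d => Dendrogram d a -> d
height (Leaf _)       = 0
height (Branch d _ _) = d

-- Weights of all leaves, left to right.
leafWeights :: Dendrogram d (a, d) -> [d]
leafWeights (Leaf (_, w))  = [w]
leafWeights (Branch _ l r) = leafWeights l ++ leafWeights r

-- Apply a function to every leaf weight, keeping branch distances.
mapWeights :: (d -> d) -> Dendrogram d (a, d) -> Dendrogram d (a, d)
mapWeights f (Leaf (x, w))  = Leaf (x, f w)
mapWeights f (Branch d l r) = Branch d (mapWeights f l) (mapWeights f r)

-- Share an edge of length e among the leaves of a subtree, in proportion
-- to their current weights (evenly if all weights are still zero).
addEdge :: (Eq d, Fractional d) => d -> Dendrogram d (a, d) -> Dendrogram d (a, d)
addEdge e t
  | total == 0 = mapWeights (+ e / n) t
  | otherwise  = mapWeights (\w -> w + e * w / total) t
  where
    ws    = leafWeights t
    total = sum ws
    n     = fromIntegral (length ws)

-- Unnormalized weights, computed bottom-up.
rawWeights :: (Eq d, Fractional d) => Dendrogram d a -> Dendrogram d (a, d)
rawWeights (Leaf x)       = Leaf (x, 0)
rawWeights (Branch d l r) = Branch d (up l) (up r)
  where
    up child = addEdge (d - height child) (rawWeights child)

-- Rescale the raw weights so that they average to 1.
gscSketch :: (Eq d, Fractional d) => Dendrogram d a -> Dendrogram d (a, d)
gscSketch t
  | total == 0 = mapWeights (const 1) raw
  | otherwise  = mapWeights (\w -> w * n / total) raw
  where
    raw   = rawWeights t
    ws    = leafWeights raw
    total = sum ws
    n     = fromIntegral (length ws)
```

For the example dendrogram, the unnormalized weights come out as 0.4375, 0.4375, 0.625 and 0.8 (sum 2.3). Scaling by 4/2.3 gives the values listed above, up to floating-point rounding.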