Infernal CMs.
- data CM = CM {
- name :: ModelIdentification
- accession :: ModelAccession
- trustedCutoff :: BitScore
- gathering :: BitScore
- noiseCutoff :: Maybe BitScore
- transition :: PrimArray (Int, Int) Double
- emission :: PrimArray (Int, Int) Double
- paths :: Vector (Vector Double)
- localBegin :: Vector Double
- begins :: Vector Int
- localEnd :: Vector Double
- nodes :: Vector (Vector Int)
- type ID2CM = Map ModelIdentification CM
- type AC2CM = Map ModelAccession CM
Documentation
A datatype representing Infernal covariance models. This is a new representation that is incompatible with the one once found in Biobase. The most important difference is that lookups are mapped onto efficient data structures, currently PrimitiveArray.
1. Each state of a covariance model has up to 6 transition scores, hence we need s*6 cells for transitions, where s is the number of states.
2. Each state of a covariance model has up to 16 emission scores, so we have s*16 cells for emissions, with unused cells set to a very high score.
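The dense layout described above can be sketched as follows. This is only an illustration: it uses `Data.Array` from the `array` package as a stand-in for PrimitiveArray, and `forbiddenScore`, `mkTable`, `transitionTable`, `emissionTable` and `cellCount` are hypothetical names that are not part of this module. The sentinel value is an assumption; the text only says that unused cells get a very high score.

```haskell
import Data.Array (Array, accumArray, elems, (!))

-- Assumed sentinel for unused cells; the real value is not given in the text.
forbiddenScore :: Double
forbiddenScore = 1.0e6

-- Build a dense (state, slot) table with `slots` cells per state.
-- Cells that are not mentioned in `known` keep the sentinel.
mkTable :: Int -> Int -> [((Int, Int), Double)] -> Array (Int, Int) Double
mkTable numStates slots known =
  accumArray (\_ new -> new) forbiddenScore
             ((0, 0), (numStates - 1, slots - 1)) known

-- Up to 6 transition scores per state: s*6 cells.
transitionTable :: Int -> [((Int, Int), Double)] -> Array (Int, Int) Double
transitionTable numStates = mkTable numStates 6

-- Up to 16 emission scores per state: s*16 cells.
emissionTable :: Int -> [((Int, Int), Double)] -> Array (Int, Int) Double
emissionTable numStates = mkTable numStates 16

-- Total number of cells in a table.
cellCount :: Array (Int, Int) Double -> Int
cellCount = length . elems
```

A lookup such as `transitionTable 3 [((0, 1), 0.5)] ! (0, 1)` then costs a single index operation, which is the point of the dense representation.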
On top of these basic structures, we then place additional high-level constructs.
3. paths are the allowed transitions. Iterating over them can save a check for whether a transition is encoded with a forbidden score.
4. localBegin and localEnd are local entry and exit strategies. A localBegin is a transition score to certain states; all such transitions are listed in begins. A localEnd is a transition score to a local end state.
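How a list of allowed transitions saves the check can be sketched with plain lists. This is a simplified illustration, not the module's code: `forbiddenScore`, `usableByScan` and `usableByPaths` are hypothetical names, the sentinel value is an assumption, and the real paths field has a different type.

```haskell
-- Assumed sentinel marking a forbidden transition.
forbiddenScore :: Double
forbiddenScore = 1.0e6

-- Without a path list: visit every slot of a state and test each score
-- against the sentinel.
usableByScan :: [Double] -> [Double]
usableByScan = filter (< forbiddenScore)

-- With a path list: visit only the slots known to be allowed, so no
-- comparison against the sentinel is needed.
usableByPaths :: [Int] -> [Double] -> [Double]
usableByPaths allowed row = [ row !! k | k <- allowed ]
```

Both functions return the same scores for a consistent input; the second one simply never looks at the forbidden cells.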
NOTE that trustedCutoff > gathering > noiseCutoff
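The ordering in this note could be checked with a small predicate. The sketch below is an assumption-laden illustration: `BitScore` is redefined locally as a stand-in for the library's type, and `cutoffsOrdered` is a hypothetical helper. A missing noise cutoff is treated as imposing no constraint.

```haskell
-- Local stand-in for the library's BitScore type.
newtype BitScore = BitScore Double
  deriving (Eq, Ord, Show)

-- True if trustedCutoff > gathering > noiseCutoff.
-- If no noise cutoff is present, only the first inequality is checked.
cutoffsOrdered :: BitScore -> BitScore -> Maybe BitScore -> Bool
cutoffsOrdered trusted gather mNoise =
  trusted > gather && maybe True (gather >) mNoise
```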
TODO: as with other projects, we should not use Doubles but Score and Probability newtypes.
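One possible shape for such newtypes is sketched below. Nothing here exists in the module yet: the definitions of `Score` and `Probability`, and the smart constructor `mkProbability`, are assumptions made purely for illustration.

```haskell
-- Hypothetical wrappers that keep scores and probabilities apart at the
-- type level, so one cannot be passed where the other is expected.
newtype Score = Score Double
  deriving (Eq, Ord, Show)

newtype Probability = Probability Double
  deriving (Eq, Ord, Show)

-- Smart constructor: only values in [0, 1] are accepted.
mkProbability :: Double -> Maybe Probability
mkProbability p
  | p >= 0 && p <= 1 = Just (Probability p)
  | otherwise        = Nothing
```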
type ID2CM = Map ModelIdentification CM
Map of model names to individual CMs.
type AC2CM = Map ModelAccession CM
Map of model accession numbers to individual CMs.
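Both maps can be built from a list of models in one pass each. The sketch below uses cut-down stand-ins for `ModelIdentification`, `ModelAccession` and `CM` (only the two key fields are kept), and `byName` and `byAccession` are hypothetical helpers, not functions exported by this module.

```haskell
import qualified Data.Map.Strict as M

-- Simplified stand-ins for the library types; the real CM has more fields.
newtype ModelIdentification = ModelIdentification String
  deriving (Eq, Ord, Show)

newtype ModelAccession = ModelAccession String
  deriving (Eq, Ord, Show)

data CM = CM
  { name      :: ModelIdentification
  , accession :: ModelAccession
  } deriving (Eq, Show)

type ID2CM = M.Map ModelIdentification CM
type AC2CM = M.Map ModelAccession CM

-- Index a list of models by name.
byName :: [CM] -> ID2CM
byName cms = M.fromList [ (name cm, cm) | cm <- cms ]

-- Index a list of models by accession number.
byAccession :: [CM] -> AC2CM
byAccession cms = M.fromList [ (accession cm, cm) | cm <- cms ]
```

A model can then be retrieved with `M.lookup` on either key. Note that `M.fromList` keeps the last entry when keys repeat.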