bio-0.4.5: A bioinformatics library
Bio.Alignment.AlignData
Contents
Data types for gap-based alignments
Helper functions
Data types for edit-based alignments
Helper functions
Description

Data structures and helper functions for calculating alignments

There are two ways to view an alignment: either as a list of edits (i.e., insertions, deletions, or substitutions), or as a set of sequences with inserted gaps.

The edit list approach is perhaps the more restrictive model, and it doesn't generalize to multiple alignments.

The gap approach is more general, and probably more commonly used by other software (see e.g. the ACE file format).
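The two views described above can be put side by side in a minimal sketch. The types below are local stand-ins that mirror the signatures on this page, not imports from the library. The helper c2w, the plain-String gap view, and the convention that Del marks a character present only in the first sequence (and Ins one present only in the second) are assumptions made for illustration.

```haskell
import Data.Char (ord)
import Data.Word (Word8)

-- Local stand-ins mirroring the types documented on this page.
type Chr = Word8

data Edit = Ins Chr | Del Chr | Repl Chr Chr
  deriving (Eq, Show)

type EditList = [Edit]

-- Hypothetical helper: convert an ASCII Char to a Chr.
c2w :: Char -> Chr
c2w = fromIntegral . ord

-- Edit view of aligning "ACG" against "ATT".
-- Assumption: Del c means c occurs only in the first sequence,
-- Ins c means c occurs only in the second.
editView :: EditList
editView =
  [ Repl (c2w 'A') (c2w 'A')
  , Del (c2w 'C')
  , Repl (c2w 'G') (c2w 'T')
  , Ins (c2w 'T')
  ]

-- Gap view of the same alignment: each row is an ungapped sequence
-- plus its gap positions, counted in the gapped row ("ACG*", "A*TT").
gapView :: [(String, [Int])]
gapView = [("ACG", [3]), ("ATT", [1])]
```

The gap view extends to a third or fourth row by adding list entries, whereas the edit list is inherently pairwise.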

Synopsis
data Dir
  = Fwd
  | Rev
type Gaps = [Offset]
type Alignment a = [(Offset, Dir, Sequence a, Gaps)]
extractGaps :: SeqData -> (SeqData, Gaps)
insertGaps :: Char -> (SeqData, Gaps) -> SeqData
data Edit
  = Ins Chr
  | Del Chr
  | Repl Chr Chr
type EditList = [Edit]
type SubstMx t a = (Chr, Chr) -> a
type Selector a = [(a, Edit)] -> a
type Chr = Word8
columns :: Selector a -> a -> Sequence b -> Sequence b -> [[a]]
eval :: SubstMx t a -> a -> Edit -> a
isRepl :: Edit -> Bool
on :: (t1 -> t1 -> t2) -> (t -> t1) -> t -> t -> t2
showalign :: EditList -> [Char]
toStrings :: EditList -> (String, String)
Data types for gap-based alignments
data Dir
Constructors
Fwd
Rev
type Gaps = [Offset]
type Alignment a = [(Offset, Dir, Sequence a, Gaps)]
Helper functions
extractGaps :: SeqData -> (SeqData, Gaps)
Gaps are coded as '*' characters; this function removes them and returns the sequence along with the list of gap positions. Note that gaps are positioned relative to the *gapped* sequence (in contrast to stmassembler/Cluster.hs).
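The behaviour described above can be sketched on plain Strings. This is an illustration only: extractGapsS is a hypothetical name, and it uses String and Int where the library uses SeqData and Offset.

```haskell
import Data.List (elemIndices)

-- Stand-in for the library's Gaps type (which is a list of Offset).
type Gaps = [Int]

-- Remove '*' gap markers and report where they were.
-- Positions are indices into the gapped input, not the ungapped result.
extractGapsS :: String -> (String, Gaps)
extractGapsS s = (filter (/= '*') s, elemIndices '*' s)
```

For example, extractGapsS "AC**GT*A" yields ("ACGTA", [2,3,6]): the third gap is reported at position 6 of the gapped input, not at position 4 of the ungapped sequence.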
insertGaps :: Char -> (SeqData, Gaps) -> SeqData
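insertGaps carries no description here. Judging by its type, it plausibly reverses extractGaps by re-inserting a chosen gap character at the recorded positions. The sketch below implements that reading on plain Strings. insertGapsS is a hypothetical name, and the sketch assumes the gap list is sorted in ascending order and indexes the gapped sequence.

```haskell
-- Stand-in for the library's Gaps type (which is a list of Offset).
type Gaps = [Int]

-- Re-insert the gap character c at each recorded position.
-- Assumes positions are ascending and refer to the gapped sequence.
insertGapsS :: Char -> (String, Gaps) -> String
insertGapsS c (s, gs) = go 0 s gs
  where
    -- No gaps left: emit the rest of the sequence unchanged.
    go _ rest [] = rest
    -- The current output position is a gap position: emit the gap character.
    go i rest (g : gs')
      | i == g = c : go (i + 1) rest gs'
    -- Otherwise copy the next sequence character.
    go i (x : xs) gaps = x : go (i + 1) xs gaps
    -- Sequence exhausted but gaps remain: pad with gap characters.
    go _ [] gaps = map (const c) gaps
```

Under these assumptions, insertGapsS '*' ("ACGTA", [2,3,6]) rebuilds "AC**GT*A".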
Data types for edit-based alignments
data Edit
An Edit is either the insertion, the deletion, or the replacement of a character.
Constructors
Ins Chr
Del Chr
Repl Chr Chr
type EditList = [Edit]
An alignment is a sequence of edits.
type SubstMx t a = (Chr, Chr) -> a
A substitution matrix gives scores for replacing one character with another. Typically, it will be symmetric. It is type-tagged with the alphabet (Nuc or Amino).
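Because SubstMx is just a function on pairs of characters, a simple matrix can be written directly as one. The sketch below uses local stand-in types. The empty Nuc declaration is a placeholder for the library's alphabet tag, and simpleMx with its +1/-1 scores is a made-up example, not a matrix shipped with the library.

```haskell
import Data.Word (Word8)

-- Local stand-ins mirroring the types documented on this page.
type Chr = Word8

type SubstMx t a = (Chr, Chr) -> a

-- Placeholder alphabet tag; it only appears at the type level.
data Nuc

-- Hypothetical identity-style matrix: +1 for a match, -1 for a mismatch.
-- It is symmetric because equality is symmetric.
simpleMx :: SubstMx Nuc Int
simpleMx (x, y)
  | x == y = 1
  | otherwise = -1
```

The tag t does not occur on the right-hand side of the synonym, so it exists purely to keep nucleotide and amino acid matrices apart in signatures.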
type Selector a = [(a, Edit)] -> a
A Selector consists of a zero element and a function that chooses a possible Edit operation and generates an updated result.
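One way to read the Selector type is as the rule that fills a single cell of the alignment matrix: each candidate pairs the value of a neighbouring cell with the Edit that leads from it, and the selector combines them into the new value. The sketch below follows that reading for plain score maximisation. The interpretation of the pairs, the names editScore and maxScore, and the scoring constants are all assumptions.

```haskell
import Data.Word (Word8)

-- Local stand-ins mirroring the types documented on this page.
type Chr = Word8

data Edit = Ins Chr | Del Chr | Repl Chr Chr
  deriving (Eq, Show)

type Selector a = [(a, Edit)] -> a

-- Hypothetical scoring: +1 match, -1 mismatch, -2 for a gap.
editScore :: Edit -> Int
editScore (Repl x y)
  | x == y = 1
  | otherwise = -1
editScore _ = -2

-- Pick the best score reachable from any candidate predecessor.
-- Partial: fails on an empty candidate list.
maxScore :: Selector Int
maxScore cands = maximum [prev + editScore e | (prev, e) <- cands]
```

With this selector the zero element would be 0, the score of the empty alignment.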
type Chr = Word8
The sequence element type, used in alignments.
Helper functions
columns :: Selector a -> a -> Sequence b -> Sequence b -> [[a]]
Calculate a set of columns containing scores. This represents the columns of the alignment matrix, but requires only linear space for score calculation.
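The linear-space property follows from the fact that each column of the matrix depends only on the column before it. The sketch below illustrates the idea for global alignment scores on plain Strings. It is not the library's columns function: it hard-codes maximisation instead of taking a Selector, and the names nextColumn, scoreColumns and globalScore are hypothetical.

```haskell
import Data.List (foldl')

-- Compute one column from the previous one.
-- gap is the (negative) gap penalty, sc scores a pair of characters,
-- xs is the first sequence, y the current character of the second.
nextColumn :: Int -> (Char -> Char -> Int) -> String -> [Int] -> Char -> [Int]
nextColumn _ _ _ [] _ = []
nextColumn gap sc xs prev@(p0 : _) y =
  scanl step (p0 + gap) (zip3 xs prev (drop 1 prev))
  where
    -- up: cell above in this column; diag and left: cells of the previous column.
    step up (x, diag, left) = maximum [diag + sc x y, left + gap, up + gap]

-- All columns, produced lazily one after another.
scoreColumns :: Int -> (Char -> Char -> Int) -> String -> String -> [[Int]]
scoreColumns gap sc xs = scanl (nextColumn gap sc xs) col0
  where
    col0 = take (length xs + 1) (iterate (+ gap) 0)

-- Final score: fold over the second sequence, carrying a single column.
globalScore :: Int -> (Char -> Char -> Int) -> String -> String -> Int
globalScore gap sc xs ys = last (foldl' (nextColumn gap sc xs) col0 ys)
  where
    col0 = take (length xs + 1) (iterate (+ gap) 0)
```

globalScore carries one column through the fold, which is why no full matrix is needed when only the score is wanted. Recovering the alignment itself requires extra bookkeeping.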
eval :: SubstMx t a -> a -> Edit -> a
Evaluate an Edit based on a SubstMx and a gap penalty.
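A natural reading of this signature is that insertions and deletions cost the gap penalty, while replacements are looked up in the substitution matrix. The sketch below implements that reading with local stand-in types. evalS and simpleMx are hypothetical names, and the library's own definition may differ.

```haskell
import Data.Word (Word8)

-- Local stand-ins mirroring the types documented on this page.
type Chr = Word8

data Edit = Ins Chr | Del Chr | Repl Chr Chr
  deriving (Eq, Show)

type SubstMx t a = (Chr, Chr) -> a

-- Assumed behaviour: gaps cost the penalty g, replacements use the matrix.
evalS :: SubstMx t a -> a -> Edit -> a
evalS _ g (Ins _) = g
evalS _ g (Del _) = g
evalS mx _ (Repl x y) = mx (x, y)

-- Hypothetical matrix for demonstration: +1 match, -1 mismatch.
simpleMx :: SubstMx t Int
simpleMx (x, y) = if x == y then 1 else -1
```

Summing evalS simpleMx (-2) over an EditList then gives the total score of that alignment under this scheme.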
isRepl :: Edit -> Bool
True if the Edit is a Repl.
on :: (t1 -> t1 -> t2) -> (t -> t1) -> t -> t -> t2
showalign :: EditList -> [Char]
toStrings :: EditList -> (String, String)
Turn an alignment into sequences with - representing gaps. (For checking: filtering out the - characters should return the original sequences, provided - isn't part of the sequence alphabet.)
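The conversion and the check mentioned above can be sketched as follows, using local stand-in types. toStringsS and w2c are hypothetical names. The sketch assumes that Del leaves a gap in the second string and Ins leaves one in the first, which the description does not specify.

```haskell
import Data.Char (chr)
import Data.Word (Word8)

-- Local stand-ins mirroring the types documented on this page.
type Chr = Word8

data Edit = Ins Chr | Del Chr | Repl Chr Chr
  deriving (Eq, Show)

type EditList = [Edit]

-- Hypothetical helper: convert a Chr back to a Char.
w2c :: Chr -> Char
w2c = chr . fromIntegral

-- Render both rows of the alignment, using '-' for gaps.
-- Assumption: Del c puts c in the first row, Ins c puts c in the second.
toStringsS :: EditList -> (String, String)
toStringsS es = (map top es, map bottom es)
  where
    top (Ins _) = '-'
    top (Del c) = w2c c
    top (Repl c _) = w2c c
    bottom (Ins c) = w2c c
    bottom (Del _) = '-'
    bottom (Repl _ d) = w2c d
```

For the edit list [Repl 65 65, Del 67, Repl 71 84, Ins 84] this gives ("ACG-", "A-TT"), and filtering out the - characters recovers "ACG" and "ATT".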
Produced by Haddock version 2.6.1