bio-0.4.8: A bioinformatics library




Common substitution matrices for alignments.

When in doubt, use BLOSUM62. Consult for some hints on good parameters for nucleotide alignments.

See also for a summary about the difference between the different matrices.


BLOSUM matrices

For BLOSUM matrices, the associated number is the percent-identity threshold used to cluster the sequences the matrix was derived from. Higher numbers therefore suit more closely related sequences.

Henikoff, S. and Henikoff, J. Amino acid substitution matrices from protein blocks. Proc. Natl. Acad. Sci. USA. 89(22): 10915-10919 (1992).

blosum45 :: (Chr, Chr) -> Int

BLOSUM45 matrix, suitable for distantly related sequences.

blosum62 :: (Chr, Chr) -> Int

The standard BLOSUM62 matrix.

blosum80 :: (Chr, Chr) -> Int

BLOSUM80 matrix, suitable for closely related sequences.

PAM matrices

For PAM matrices, the number indicates the evolutionary distance between the sequences being compared, expressed as accepted point mutations per 100 residues. Lower numbers suit more closely related sequences.

Dayhoff, M.O., Schwartz, R.M., Orcutt, B.C. A model of evolutionary change in proteins. In "Atlas of Protein Sequence and Structure" 5(3) M.O. Dayhoff (ed.), 345 - 352 (1978).

pam30 :: (Chr, Chr) -> Int

The standard PAM30 matrix.

pam70 :: (Chr, Chr) -> Int

The standard PAM70 matrix.

BLASTn defaults, for nucleotide sequences

blastn_default :: Num a => (Chr, Chr) -> a

BLAST defaults; use with gap_open = -5 and gap_extend = -3. This should really check for valid nucleotides, and perhaps be more lenient in the case of Ns. Oh well.
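
To illustrate how a nucleotide matrix combines with the suggested gap penalties, here is a minimal sketch that scores an already-built alignment. Everything in it is an assumption made for illustration: `Chr` is replaced by `Char`, `toyBlastn` uses match +1 and mismatch -3 rather than values read from the library, and `Column` and `scoreAlignment` are hypothetical names. It also assumes the first column of a gap run costs gap_open and each further column costs gap_extend, which may not match the library's own convention.

```haskell
module Main where

-- Stand-in for the library's Chr type; the real definition may differ.
type Chr = Char

-- Assumed scores (match +1, mismatch -3); not taken from the library source.
toyBlastn :: Num a => (Chr, Chr) -> a
toyBlastn (x, y) = if x == y then 1 else -3

-- One alignment column: a pair of residues, or a gap in either sequence.
data Column = Pair Chr Chr | Gap
  deriving (Eq, Show)

-- Score a finished alignment with affine gaps: the first column of a gap run
-- costs gapOpen, every further column of the same run costs gapExtend.
scoreAlignment :: Int -> Int -> ((Chr, Chr) -> Int) -> [Column] -> Int
scoreAlignment gapOpen gapExtend mx = go False
  where
    go _ [] = 0
    go _ (Pair x y : rest) = mx (x, y) + go False rest
    go inGap (Gap : rest) =
      (if inGap then gapExtend else gapOpen) + go True rest

main :: IO ()
main =
  -- 1 (match) - 3 (mismatch) - 5 (gap open) - 3 (gap extend) + 1 (match)
  print (scoreAlignment (-5) (-3) toyBlastn
    [Pair 'A' 'A', Pair 'C' 'G', Gap, Gap, Pair 'T' 'T'])  -- prints -9
```

The sketch does not distinguish which sequence carries the gap, since the penalty is the same either way under these assumptions.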

Generic and simple matrix generator

simpleMx :: Num a => a -> a -> (Chr, Chr) -> a

Construct a simple matrix from a match score and a mismatch penalty.
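
A match/mismatch matrix of this shape can be sketched as follows. This is not the library's implementation: `simpleMx'` is a hypothetical stand-in, `Chr` is replaced by `Char`, and residues are compared by exact equality, so the real function may treat case or ambiguity codes differently.

```haskell
module Main where

-- Stand-in for the library's Chr type; the real definition may differ.
type Chr = Char

-- Hypothetical reimplementation: return the match score for identical
-- residues and the mismatch penalty for everything else.
simpleMx' :: Num a => a -> a -> (Chr, Chr) -> a
simpleMx' match mismatch (x, y)
  | x == y    = match
  | otherwise = mismatch

main :: IO ()
main = do
  let mx = simpleMx' 2 (-1) :: (Chr, Chr) -> Int
  print (map mx [('A', 'A'), ('A', 'C')])                 -- prints [2,-1]
  -- Ungapped score of two equal-length sequences: sum over aligned pairs.
  print (sum (map mx (zip "ACGT" "ACGA")))                -- prints 5
```

Because the result is an ordinary function on residue pairs, it can be passed wherever one of the predefined matrices would be.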