bio-0.3.3.2: A bioinformatics library
Bio.Alignment.BlastData
Description

This module implements a hierarchical data structure for BLAST results. An alternative flat structure is available in the Bio.Alignment.BlastFlat module.

BLAST is a tool for searching (biological) sequences for similarity. This library is tested against NCBI-blast version 2.2.14. Several independent versions of BLAST exist, so expect some incompatibilities if you're using a different BLAST version.

For parsing BLAST results, the XML format (blastall -m 7) is by far the most robust choice, and is implemented in the Bio.Alignment.BlastXML module.

The format is straightforward (and non-recursive). For more information on BLAST, check http://www.ncbi.nlm.nih.gov/Education/BLASTinfo/information3.html

Synopsis
type SeqId = ByteString
data Strand
= Plus
| Minus
data Aux
= Strands !Strand !Strand
| Frame !Strand !Int
data BlastResult = BlastResult {
blastprogram :: !ByteString
blastversion :: !ByteString
blastdate :: !ByteString
blastreferences :: !ByteString
database :: !ByteString
dbsequences :: !Integer
dbchars :: !Integer
results :: [BlastRecord]
}
data BlastRecord = BlastRecord {
query :: !SeqId
qlength :: !Int
hits :: [BlastHit]
}
data BlastHit = BlastHit {
subject :: !SeqId
slength :: !Int
matches :: [BlastMatch]
}
data BlastMatch = BlastMatch {
bits :: !Double
e_val :: !Double
identity :: (Int, Int)
q_from :: !Int
q_to :: !Int
h_from :: !Int
h_to :: !Int
aux :: !Aux
}
Documentation
type SeqId = ByteString
The sequence id, i.e. the first word of the header field.
data Strand
The Strand indicates the direction of the match, i.e. the plain sequence or its reverse complement.
Constructors
Plus
Minus
data Aux
The Aux field in the BLAST output includes match information that depends on the BLAST flavor (blastn, blastx, or blastp). This data structure captures those variations.
Constructors
Strands !Strand !Strand -- blastn
Frame !Strand !Int -- blastx
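As a rough illustration of how the two Aux constructors might be consumed, the sketch below pattern-matches on them to build a short label. The types are redeclared locally so the fragment compiles without the bio package, and the derived Show and Eq instances are an assumption. The helpers strandSign and describeAux are hypothetical and not part of the library. The documentation does not say which of the two Strands fields belongs to the query and which to the subject, so the sketch simply prints them in order.

```haskell
-- Local copies of the documented types so the sketch compiles on its own;
-- real code would import Bio.Alignment.BlastData instead.
data Strand = Plus | Minus
  deriving (Show, Eq)   -- derived instances are an assumption

data Aux
  = Strands !Strand !Strand   -- blastn
  | Frame !Strand !Int        -- blastx
  deriving (Show, Eq)

-- | Hypothetical helper (not part of the library): one-character strand label.
strandSign :: Strand -> String
strandSign Plus  = "+"
strandSign Minus = "-"

-- | Hypothetical helper: render the flavor-specific part of a match.
-- The two Strands fields are printed in declaration order; their exact
-- meaning (query vs. subject) is not stated in the documentation.
describeAux :: Aux -> String
describeAux (Strands s1 s2) = "strands " ++ strandSign s1 ++ "/" ++ strandSign s2
describeAux (Frame s n)     = "frame " ++ strandSign s ++ show n
```

With these definitions, describeAux (Frame Minus 2) evaluates to "frame -2".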
data BlastResult
A BlastResult is the root of the hierarchy.
Constructors
BlastResult
blastprogram :: !ByteString
blastversion :: !ByteString
blastdate :: !ByteString
blastreferences :: !ByteString
database :: !ByteString
dbsequences :: !Integer
dbchars :: !Integer
results :: [BlastRecord]
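The hierarchy runs BlastResult, then BlastRecord, then BlastHit, then BlastMatch. The following sketch walks all four levels with a list comprehension and pairs every match with its query and subject ids. The types are redeclared locally so the fragment compiles without the bio package. The use of Data.ByteString.Char8 is an assumption (this page does not say whether the library uses strict or lazy ByteStrings), as are the derived instances. The function flatten is a hypothetical helper and is not claimed to be how Bio.Alignment.BlastFlat works.

```haskell
import qualified Data.ByteString.Char8 as B  -- strict ByteString is an assumption

-- Local copies of the documented types; real code would import
-- Bio.Alignment.BlastData instead.
type SeqId = B.ByteString

data Strand = Plus | Minus deriving (Show, Eq)

data Aux = Strands !Strand !Strand | Frame !Strand !Int
  deriving (Show, Eq)

data BlastMatch = BlastMatch
  { bits :: !Double, e_val :: !Double, identity :: (Int, Int)
  , q_from :: !Int, q_to :: !Int, h_from :: !Int, h_to :: !Int
  , aux :: !Aux
  } deriving (Show, Eq)

data BlastHit = BlastHit
  { subject :: !SeqId, slength :: !Int, matches :: [BlastMatch] }
  deriving (Show, Eq)

data BlastRecord = BlastRecord
  { query :: !SeqId, qlength :: !Int, hits :: [BlastHit] }
  deriving (Show, Eq)

data BlastResult = BlastResult
  { blastprogram :: !B.ByteString, blastversion :: !B.ByteString
  , blastdate :: !B.ByteString, blastreferences :: !B.ByteString
  , database :: !B.ByteString
  , dbsequences :: !Integer, dbchars :: !Integer
  , results :: [BlastRecord]
  } deriving (Show, Eq)

-- | Hypothetical helper: one (query id, subject id, match) triple per
-- BlastMatch, in the order the hierarchy stores them.
flatten :: BlastResult -> [(SeqId, SeqId, BlastMatch)]
flatten res =
  [ (query r, subject h, m)
  | r <- results res   -- one record per query sequence
  , h <- hits r        -- one hit per matching subject
  , m <- matches h     -- possibly several matches per hit
  ]
```

Queries without hits contribute nothing to the output, since the comprehension has no elements to draw from at the hits level.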
data BlastRecord
Each query sequence generates a BlastRecord.
Constructors
BlastRecord
query :: !SeqId
qlength :: !Int
hits :: [BlastHit]
data BlastHit
Each match between a query and a target sequence (or subject) is a BlastHit.
Constructors
BlastHit
subject :: !SeqId
slength :: !Int
matches :: [BlastMatch]
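Because a hit carries a list of matches, a common need is to pick one representative match per hit. The sketch below selects the match with the highest bit score and returns Nothing for a hit with no matches. The types are redeclared locally so the fragment compiles without the bio package. The Data.ByteString.Char8 import and the derived instances are assumptions, and bestMatch is a hypothetical helper, not a library function.

```haskell
import qualified Data.ByteString.Char8 as B  -- strict ByteString is an assumption
import Data.List (maximumBy)
import Data.Ord (comparing)

-- Local copies of the documented types; real code would import
-- Bio.Alignment.BlastData instead.
type SeqId = B.ByteString

data Strand = Plus | Minus deriving (Show, Eq)

data Aux = Strands !Strand !Strand | Frame !Strand !Int
  deriving (Show, Eq)

data BlastMatch = BlastMatch
  { bits :: !Double, e_val :: !Double, identity :: (Int, Int)
  , q_from :: !Int, q_to :: !Int, h_from :: !Int, h_to :: !Int
  , aux :: !Aux
  } deriving (Show, Eq)

data BlastHit = BlastHit
  { subject :: !SeqId, slength :: !Int, matches :: [BlastMatch] }
  deriving (Show, Eq)

-- | Hypothetical helper: the match with the highest bit score, or Nothing
-- when the hit has no matches (maximumBy alone would fail on an empty list).
bestMatch :: BlastHit -> Maybe BlastMatch
bestMatch hit = case matches hit of
  [] -> Nothing
  ms -> Just (maximumBy (comparing bits) ms)
```

Ranking by bits rather than e_val is a design choice of this sketch. Either field could serve as the sort key, depending on the analysis.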
data BlastMatch
A BlastHit may contain multiple separate matches (typically when an indel causes a frameshift that blastx is unable to bridge).
Constructors
BlastMatch
bits :: !Double
e_val :: !Double
identity :: (Int, Int)
q_from :: !Int
q_to :: !Int
h_from :: !Int
h_to :: !Int
aux :: !Aux
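The numeric fields of a BlastMatch lend themselves to simple filtering. The sketch below keeps matches at or below an e-value cutoff and derives a percent identity. The types are redeclared locally so the fragment compiles without the bio package, and the derived instances are an assumption. The helpers significant and percentIdentity are hypothetical. Reading the identity pair as (identical positions, alignment length) is also an assumption, since this page does not define the two components.

```haskell
-- Local copies of the documented types; real code would import
-- Bio.Alignment.BlastData instead.
data Strand = Plus | Minus deriving (Show, Eq)

data Aux = Strands !Strand !Strand | Frame !Strand !Int
  deriving (Show, Eq)

data BlastMatch = BlastMatch
  { bits :: !Double, e_val :: !Double, identity :: (Int, Int)
  , q_from :: !Int, q_to :: !Int, h_from :: !Int, h_to :: !Int
  , aux :: !Aux
  } deriving (Show, Eq)

-- | Hypothetical helper: keep matches whose e-value is at or below the cutoff.
significant :: Double -> [BlastMatch] -> [BlastMatch]
significant cutoff = filter ((<= cutoff) . e_val)

-- | Hypothetical helper: percent identity, ASSUMING the pair means
-- (identical positions, alignment length). Returns Nothing when the
-- second component is zero, to avoid dividing by zero.
percentIdentity :: BlastMatch -> Maybe Double
percentIdentity m = case identity m of
  (_, 0)   -> Nothing
  (n, len) -> Just (100 * fromIntegral n / fromIntegral len)
```

For example, significant 1e-5 (matches hit) would retain only the matches of a hit with an e-value of at most 1e-5.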
Produced by Haddock version 2.4.2