bio-0.5: A bioinformatics library

Bio.Alignment.BlastFlat

Description

This module implements a "flattened" data structure for Blast hits, as opposed to the hierarchical structure in Bio.Alignment.BlastData.

The flat data type is useful in the many cases where it is more natural to see the result as a set of rows (e.g. for insertion into a database).

It would probably be more memory-efficient to go the other way (i.e. from flat to hierarchical), by passing the current, partially built BlastFlat object down the stream of results and stamping out a stream of completed ones. (See Bio.Alignment.BlastXML.breaks for this week's most cumbersome use of parallelism to avoid the memory issue.)

The BlastFlat data type

data BlastFlat

The BlastFlat data structure contains information about a single match.

Constructors

BlastFlat 

Fields

query :: !SeqId
 
qlength :: !Int
 
subject :: !SeqId
 
slength :: !Int
 
bits :: !Double
 
e_val :: !Double
 
identity :: (Int, Int)
 
q_from :: !Int
 
q_to :: !Int
 
h_from :: !Int
 
h_to :: !Int
 
aux :: !Aux
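
Written as a record, the fields listed above might look like the following minimal sketch. It is not the library's source: SeqId is replaced by a String stand-in, Aux and Strand are redeclared locally so the block compiles on its own, and toRow is a hypothetical helper that only illustrates the "set of rows" use mentioned in the description. The two components of identity are printed as they are, without assuming what each one means.

```haskell
import Data.List (intercalate)

-- Stand-in for the library's SeqId type (assumption for this sketch).
type SeqId = String

-- Local copies of the re-exported types, so the example is self-contained.
data Strand = Plus | Minus deriving (Show, Eq)
data Aux = Strands !Strand !Strand | Frame !Strand !Int deriving (Show, Eq)

-- One record per match, mirroring the field list documented above.
data BlastFlat = BlastFlat
  { query    :: !SeqId
  , qlength  :: !Int
  , subject  :: !SeqId
  , slength  :: !Int
  , bits     :: !Double
  , e_val    :: !Double
  , identity :: (Int, Int)
  , q_from   :: !Int
  , q_to     :: !Int
  , h_from   :: !Int
  , h_to     :: !Int
  , aux      :: !Aux
  } deriving (Show)

-- Hypothetical helper: render one match as a tab-separated row.
toRow :: BlastFlat -> String
toRow b = intercalate "\t"
  [ query b
  , subject b
  , show (bits b)
  , show (e_val b)
  , show (fst (identity b)) ++ "/" ++ show (snd (identity b))
  , show (q_from b), show (q_to b)
  , show (h_from b), show (h_to b)
  ]

-- Made-up values, for illustration only.
example :: BlastFlat
example = BlastFlat "q1" 100 "s1" 250 52.8 1.0e-5 (48, 50) 1 50 101 150 (Strands Plus Plus)
```

Because every match carries its own query and subject identifiers, each BlastFlat value can be written out or inserted independently of the others.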
 

Read XML format

Convert from hierarchical to flat structure

flatten :: [BlastRecord] -> [BlastFlat]

Convert BlastRecords into BlastFlats, representing a depth-first traversal of the BlastRecord structure.
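
The depth-first traversal can be illustrated with a small sketch. The types below (Record, Hit, Match, Flat) are simplified, hypothetical stand-ins and not the definitions in Bio.Alignment.BlastData; the sketch assumes a hierarchy of records containing hits containing matches, and flattenSketch is not the library's flatten.

```haskell
-- Stand-in for the library's SeqId type (assumption for this sketch).
type SeqId = String

-- Simplified hierarchical structure (assumed shape): record -> hits -> matches.
data Match  = Match  { mBits :: Double, mEval :: Double }     deriving (Show, Eq)
data Hit    = Hit    { hSubject :: SeqId, hMatches :: [Match] } deriving (Show, Eq)
data Record = Record { rQuery :: SeqId, rHits :: [Hit] }        deriving (Show, Eq)

-- Simplified flat row: the query and subject are copied into every match.
data Flat = Flat
  { fQuery   :: SeqId
  , fSubject :: SeqId
  , fBits    :: Double
  , fEval    :: Double
  } deriving (Show, Eq)

-- Depth-first traversal: all matches of the first hit of the first record
-- come first, then the next hit, and so on.
flattenSketch :: [Record] -> [Flat]
flattenSketch rs =
  [ Flat (rQuery r) (hSubject h) (mBits m) (mEval m)
  | r <- rs
  , h <- rHits r
  , m <- hMatches h
  ]

-- Made-up input, for illustration only.
sample :: [Record]
sample =
  [ Record "q1" [ Hit "s1" [Match 50 1e-5, Match 30 1e-2]
                , Hit "s2" [Match 20 0.5] ]
  , Record "q2" [ Hit "s3" [Match 80 1e-20] ]
  ]
```

Since the list comprehension is lazy, rows are produced one at a time as the result is consumed. A record with no hits, or a hit with no matches, contributes no rows in this sketch.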

Re-exports from the hierarchical module (Bio.Alignment.BlastData)

data BlastRecord

Each query sequence generates a BlastRecord.

data Aux

The Aux field in the BLAST output includes match information that depends on the BLAST flavor (blastn, blastx, or blastp). This data structure captures those variations.

Constructors

Strands !Strand !Strand

blastn

Frame !Strand !Int

blastx

data Strand

The Strand indicates the direction of the match, i.e. the plain sequence or its reverse complement.

Constructors

Plus 
Minus
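
As a brief illustration of how the Aux and Strand constructors might be consumed, the sketch below pattern-matches on both Aux alternatives. The types are redeclared locally so the block stands alone, and describeAux is a hypothetical helper, not part of the library. Which of the two Strand arguments of Strands refers to the query and which to the subject is not stated above, so the sketch prints them in order without labelling them.

```haskell
-- Local copies of the documented types, so the example is self-contained.
data Strand = Plus | Minus deriving (Show, Eq)

data Aux
  = Strands !Strand !Strand  -- blastn
  | Frame !Strand !Int       -- blastx
  deriving (Show, Eq)

-- Hypothetical helper: a human-readable summary of the flavor-specific data.
describeAux :: Aux -> String
describeAux (Strands a b) = "blastn: strands " ++ show a ++ "/" ++ show b
describeAux (Frame s n)   = "blastx: strand " ++ show s ++ ", frame " ++ show n

-- True when any strand involved is the reverse complement.
involvesMinus :: Aux -> Bool
involvesMinus (Strands a b) = a == Minus || b == Minus
involvesMinus (Frame s _)   = s == Minus
```

Matching on the constructor is what distinguishes a blastn match from a blastx match here, so code that handles only one flavor should still cover the other alternative explicitly.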