biostockholm-0.2: Parsing and rendering of Stockholm files (used by Pfam, Rfam and Infernal).

Bio.Sequence.Stockholm.Document

Description

Take low-level Events and turn them into high-level data structures.

Data types

data Stockholm

A Stockholm 1.0 formatted file represented in memory.

data Ann d

A generic annotation.

Constructors

Ann 

Fields

feature :: !d
 
text :: !ByteString
 

Instances

Typeable1 Ann 
Eq d => Eq (Ann d) 
Ord d => Ord (Ann d) 
Show d => Show (Ann d) 
NFData (Ann d) 
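
As a rough illustration of how an Ann pairs a feature with its text, the sketch below defines a local mirror of the record rather than importing the library. The Tag type, the exampleId value and the textsFor helper are hypothetical names introduced only for this example.

```haskell
import qualified Data.ByteString.Char8 as B

-- Local mirror of the documented record; not imported from the library.
data Ann d = Ann
  { feature :: !d             -- the feature being annotated
  , text    :: !B.ByteString  -- the raw text of the annotation
  } deriving (Eq, Ord, Show)

-- Hypothetical stand-in for a feature type such as FileAnnotation.
data Tag = ID | DE
  deriving (Eq, Ord, Show)

-- An example value pairing a feature with its text.
exampleId :: Ann Tag
exampleId = Ann { feature = ID, text = B.pack "7kD_DNA_binding" }

-- Hypothetical helper: collect the text of every annotation
-- that carries the given feature, in input order.
textsFor :: Eq d => d -> [Ann d] -> [B.ByteString]
textsFor f anns = [text a | a <- anns, feature a == f]
```

Because the feature type is a parameter, the same record shape can carry file, sequence or column features.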

data FileAnnotation

Possible file annotations.

Constructors

AC

Accession number: Accession number in form PFxxxxx.version or PBxxxxxx.

ID

Identification: One word name for family.

DE

Definition: Short description of family.

AU

Author: Authors of the entry.

SE

Source of seed: The source suggesting the seed members belong to one family.

GA

Gathering method: Search threshold to build the full alignment.

TC

Trusted Cutoff: Lowest sequence score and domain score of match in the full alignment.

NC

Noise Cutoff: Highest sequence score and domain score of match not in full alignment.

TP

Type: Type of family (presently Family, Domain, Motif or Repeat).

SQ

Sequence: Number of sequences in alignment.

AM

Alignment Method: The order in which ls and fs hits are aligned to the model to build the full alignment.

DC

Database Comment: Comment about database reference.

DR

Database Reference: Reference to external database.

RC

Reference Comment: Comment about literature reference.

RN

Reference Number: Reference Number.

RM

Reference Medline: Eight-digit Medline UI number.

RT

Reference Title: Reference Title.

RA

Reference Author: Reference Author.

RL

Reference Location: Journal location.

PI

Previous identifier: Record of all previous ID lines.

KW

Keywords: Keywords.

CC

Comment: Comments.

NE

Pfam accession: Indicates a nested domain.

NL

Location: Location of nested domains - sequence ID, start and end of insert.

F_Other !ByteString

Other file annotation.
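
The F_Other constructor suggests that unrecognised tags are kept rather than rejected. A minimal sketch of that idea follows, using a local mirror with only a few of the constructors above. The parseFileTag and showFileTag helpers are hypothetical and are not taken from the library.

```haskell
import qualified Data.ByteString.Char8 as B

-- Local mirror with only a few of the documented constructors.
data FileAnnotation
  = AC
  | ID
  | DE
  | AU
  | F_Other !B.ByteString
  deriving (Eq, Ord, Show)

-- Hypothetical helper: choose a constructor from a two-letter tag.
parseFileTag :: B.ByteString -> FileAnnotation
parseFileTag tag = case B.unpack tag of
  "AC" -> AC
  "ID" -> ID
  "DE" -> DE
  "AU" -> AU
  _    -> F_Other tag  -- unknown tags are kept verbatim

-- Hypothetical inverse: recover the tag, for example when rendering.
showFileTag :: FileAnnotation -> B.ByteString
showFileTag (F_Other tag) = tag
showFileTag known         = B.pack (show known)
```

Under these assumptions a tag that the type does not name still survives a parse and render round trip unchanged.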

data SequenceAnnotation

Possible sequence annotations.

Constructors

S_AC

Accession number

S_DE

Description

S_DR

Database reference

OS

Organism (species)

OC

Organism classification (clade, etc.)

LO

Look (Color, etc.)

S_Other !ByteString

Other sequence annotation.

data ColumnAnnotation a

Possible column annotations. The phantom type parameter a is either InFile or InSeq.

Constructors

SS

Secondary structure.

SA

Surface accessibility.

TM

TransMembrane.

PP

Posterior probability.

LI

LIgand binding.

AS

Active site.

PAS

AS - Pfam predicted.

SAS

AS - from SwissProt.

IN

INtron (in or after).

C_Other !ByteString

Other column annotation.

data InFile

Phantom type for ColumnAnnotations of the whole file.

Instances

ClmnFeatureLoc InFile 

data InSeq

Phantom type for ColumnAnnotations of a single sequence.

Instances

ClmnFeatureLoc InSeq 
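
To show how a phantom parameter can separate whole-file column annotations from per-sequence ones, here is a small sketch built on local mirrors of the types above. The FeatureLoc class and its markupTag method are hypothetical stand-ins, since the methods of ClmnFeatureLoc are not shown here. Mapping InFile to the Stockholm "#=GC" markup and InSeq to "#=GR" is likewise an assumption for illustration.

```haskell
-- Local mirrors of the documented types, with a subset of constructors.
data InFile
data InSeq

-- The parameter a is phantom: no constructor stores a value of type a.
data ColumnAnnotation a = SS | SA | TM
  deriving (Eq, Show)

-- Hypothetical class standing in for ClmnFeatureLoc.
class FeatureLoc a where
  markupTag :: ColumnAnnotation a -> String

instance FeatureLoc InFile where
  markupTag _ = "#=GC"  -- per-column markup for the whole alignment

instance FeatureLoc InSeq where
  markupTag _ = "#=GR"  -- per-residue markup for one sequence

-- The same constructor at two different phantom types.
fileSS :: ColumnAnnotation InFile
fileSS = SS

seqSS :: ColumnAnnotation InSeq
seqSS = SS
```

The two values share a constructor, yet the type checker refuses to mix them, and the instance chosen depends only on the phantom type.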

Conduits

parseDoc :: Resource m => Conduit Event m Stockholm

Conduit that parses Events into Stockholm documents.

renderDoc :: Resource m => Conduit Stockholm m Event

Conduit that renders Stockholm documents into Events.