bio-0.5.3: A bioinformatics library

Safe Haskell: Safe-Inferred

Bio.Sequence.GeneOntology

Description

GeneOntology - parse and index Gene Ontology (GO) annotations, in particular the file 'gene_association.goa_uniprot', which contains links between GO terms and UniProt accessions.

Basic data types

newtype GoTerm

A GO term is a positive integer.

Constructors

GO Int 
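GO identifiers are conventionally written as `GO:` followed by seven zero-padded digits (for example `GO:0008150`). The following is a minimal sketch of converting between that textual form and a `GoTerm`. The local `GoTerm` declaration is a stand-in for the library's type with assumed derived instances, and `formatGoTerm` and `parseGoTerm` are hypothetical helpers, not functions exported by this module.

```haskell
import Data.Char (isDigit)
import Text.Printf (printf)

-- Stand-in for the documented newtype; the derived instances are an assumption.
newtype GoTerm = GO Int deriving (Eq, Ord, Show)

-- Hypothetical helper: render a term in the conventional "GO:nnnnnnn" form.
formatGoTerm :: GoTerm -> String
formatGoTerm (GO n) = printf "GO:%07d" n

-- Hypothetical helper: accept only "GO:" followed by exactly seven digits.
parseGoTerm :: String -> Maybe GoTerm
parseGoTerm ('G':'O':':':ds)
  | length ds == 7, all isDigit ds = Just (GO (read ds))
parseGoTerm _ = Nothing
```

Keeping the term as a plain `Int` inside the newtype makes comparisons and map lookups cheap, while the textual form is only needed at the input and output boundaries.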

data GoDef

A GoDef maps a GoTerm to a description and a GoClass.

Constructors

GoDef !GoTerm !ByteString !GoClass 

Reading the OBO format

type GoHierarchy = [(GoDef, [GoTerm])]

A list of GO definitions, each paired with pointers to its parent nodes, as read from the .obo file. The user may construct the explicit hierarchy by storing these in a Map or similar structure.

readObo :: FilePath -> IO GoHierarchy

Read the GO hierarchy from the .obo file. Note that this is not quite a tree structure, since a term may have more than one parent.
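One possible way to make the hierarchy explicit is to index each term's parent list in a `Data.Map` and then walk it to collect all ancestors. The sketch below uses local stand-ins for the documented types so that it is self-contained; the derived instances, the helper names `parentMap` and `ancestors`, and the toy data in `example` (made-up term numbers, not real GO terms) are all assumptions. Because a term can be reached through several parents, the traversal tracks visited terms in a set.

```haskell
import qualified Data.ByteString.Char8 as B
import qualified Data.Map.Strict as M
import qualified Data.Set as S

-- Stand-ins mirroring the documented types; derived instances are assumptions.
newtype GoTerm = GO Int deriving (Eq, Ord, Show)
data GoClass = Func | Proc | Comp deriving (Eq, Show)
data GoDef = GoDef !GoTerm !B.ByteString !GoClass deriving (Eq, Show)
type GoHierarchy = [(GoDef, [GoTerm])]

-- Hypothetical helper: map each term to its direct parents.
parentMap :: GoHierarchy -> M.Map GoTerm [GoTerm]
parentMap h = M.fromList [ (t, ps) | (GoDef t _ _, ps) <- h ]

-- Hypothetical helper: every term reachable by following parent links.
-- The visited set keeps shared ancestors from being expanded twice.
ancestors :: M.Map GoTerm [GoTerm] -> GoTerm -> S.Set GoTerm
ancestors m t = go S.empty (M.findWithDefault [] t m)
  where
    go seen [] = seen
    go seen (p:ps)
      | p `S.member` seen = go seen ps
      | otherwise         = go (S.insert p seen) (M.findWithDefault [] p m ++ ps)

-- Toy hierarchy with made-up term numbers: term 4 has two parents (2 and 3),
-- which both descend from term 1.
example :: GoHierarchy
example =
  [ (GoDef (GO 1) (B.pack "root") Proc, [])
  , (GoDef (GO 2) (B.pack "left") Proc, [GO 1])
  , (GoDef (GO 3) (B.pack "right") Proc, [GO 1])
  , (GoDef (GO 4) (B.pack "leaf") Proc, [GO 2, GO 3])
  ]
```

With the real library, the same `parentMap` could presumably be applied to the result of `readObo`, provided `GoTerm` has an `Ord` instance.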

Reading 'terms and ids'

readTerms :: FilePath -> IO [GoDef]

Read GO term definitions from the GO.terms_and_ids file.

Reading UniProt associations

data Annotation

A GOA annotation, containing a UniProt identifier, a GoTerm and an evidence code.

type UniProtAcc = ByteString

A UniProt identifier (short string of capitals and numbers).

data GoClass

Constructors

Func 
Proc 
Comp 

data EvidenceCode

Evidence codes describe the type of support for an annotation. See http://www.geneontology.org/GO.evidence.shtml

Constructors

IC    Inferred by Curator
IDA   Inferred from Direct Assay
IEA   Inferred from Electronic Annotation
IEP   Inferred from Expression Pattern
IGC   Inferred from Genomic Context
IGI   Inferred from Genetic Interaction
IMP   Inferred from Mutant Phenotype
IPI   Inferred from Physical Interaction
ISS   Inferred from Sequence or Structural Similarity
NAS   Non-traceable Author Statement
ND    No biological Data available
RCA   Inferred from Reviewed Computational Analysis
TAS   Traceable Author Statement
NR    Not Recorded

readGOA :: FilePath -> IO [Annotation]

Read the goa_uniprot file (warning: this one is huge!).

isCurated :: EvidenceCode -> Bool

The vast majority of GOA data is IEA, while the most reliable information is manually curated. Filtering on this is useful to keep data set sizes manageable, too.
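Filtering a list of annotations on an evidence-code predicate might look like the sketch below. Since this page does not show the constructors of `Annotation`, the `Ann` constructor, its field order, and the `evidence` accessor are hypothetical stand-ins. `notElectronic` is a placeholder predicate that only excludes IEA; it is not the library's definition of `isCurated`, which may treat other codes differently.

```haskell
import qualified Data.ByteString.Char8 as B

-- Stand-ins mirroring the documented types; derived instances are assumptions.
newtype GoTerm = GO Int deriving (Eq, Ord, Show)
type UniProtAcc = B.ByteString
data EvidenceCode
  = IC | IDA | IEA | IEP | IGC | IGI | IMP
  | IPI | ISS | NAS | ND | RCA | TAS | NR
  deriving (Eq, Show)

-- Hypothetical constructor and field order; the real Annotation may differ.
data Annotation = Ann !UniProtAcc !GoTerm !EvidenceCode deriving (Eq, Show)

-- Hypothetical accessor for the evidence code of an annotation.
evidence :: Annotation -> EvidenceCode
evidence (Ann _ _ e) = e

-- Keep only the annotations whose evidence code satisfies the predicate.
filterByEvidence :: (EvidenceCode -> Bool) -> [Annotation] -> [Annotation]
filterByEvidence p = filter (p . evidence)

-- Placeholder predicate: drops electronically inferred annotations only.
-- This is an assumption for illustration, not the library's isCurated.
notElectronic :: EvidenceCode -> Bool
notElectronic = (/= IEA)
```

Passing the predicate as an argument keeps the filter independent of any particular notion of "curated", so `isCurated` could be supplied in place of `notElectronic` when working with the library's own types.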

Utility stuff