Safe Haskell	Safe-Inferred

Bio.Sequence.Fasta

Contents

Reading and writing plain FASTA files
Counting sequences in a FASTA file
Helper function for reading your own sequences

Description

This module provides functions for reading and writing sequence data in the FASTA format. Each sequence consists of a header line (prefixed with >) followed by one or more lines containing the sequence data.

As FASTA is used for both amino acid and nucleotide sequences, the resulting Sequences are type-tagged with Unknown. If you know the type of sequence you are reading, use castToAmino or castToNuc.

Documentation

data Sequence

Constructors

Seq SeqLabel SeqData (Maybe QualData)

Instances

Eq Sequence
Show Sequence
BioSeq Sequence

Reading and writing plain FASTA files

readFasta :: FilePath -> IO [Sequence]

Lazily read sequences from a FASTA-formatted file.

writeFasta :: FilePath -> [Sequence] -> IO ()

Write sequences to a FASTA-formatted file. Line length is 60.
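
The 60-column layout can be illustrated with a small sketch that uses only the Prelude. The names wrapAt and renderFasta are hypothetical helpers written for this example, and plain String stands in for the library's own types; this is not how writeFasta is implemented.

```haskell
-- Hypothetical helpers for illustration; not part of Bio.Sequence.Fasta.

-- | Split a string into chunks of at most n characters.
wrapAt :: Int -> String -> [String]
wrapAt n s
  | null s    = []
  | n <= 0    = [s]          -- a non-positive width means "do not wrap"
  | otherwise = let (h, t) = splitAt n s in h : wrapAt n t

-- | Render one record as FASTA lines: a '>' header line,
--   followed by the sequence data wrapped at 60 columns.
renderFasta :: String -> String -> [String]
renderFasta label dat = ('>' : label) : wrapAt 60 dat
```

Under these assumptions, a 130-character sequence is rendered as one header line followed by data lines of 60, 60, and 10 characters.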

hReadFasta :: Handle -> IO [Sequence]

Lazily read sequences from a handle.

hWriteFasta :: Handle -> [Sequence] -> IO ()

Write sequences in FASTA format to a handle.

Counting sequences in a FASTA file

countSeqs :: FilePath -> IO Int

Helper function for reading your own sequences

mkSeqs :: [ByteString] -> [Sequence]

Convert a list of FASTA-formatted lines into a list of sequences. Blank lines are ignored. Comment lines are allowed between sequences and are also ignored. Lines starting with > initiate a new sequence.
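
The grouping rule described for mkSeqs can be sketched over plain String lines using only base. Record and parseRecords are hypothetical names chosen for this example. The comment test is passed in as a predicate because the comment marker is an assumption here, and unlike the description above this sketch drops comment lines wherever they occur, not only between sequences.

```haskell
import Data.Char (isSpace)
import Data.List (isPrefixOf)

-- | Hypothetical simplified record: header text (without the '>')
--   and the concatenated sequence data.
type Record = (String, String)

-- | Group FASTA-formatted lines into records.
--   The first argument decides which lines count as comments (an assumption).
parseRecords :: (String -> Bool) -> [String] -> [Record]
parseRecords isComment = go . filter keep
  where
    -- Drop blank lines and comment lines before grouping.
    keep l = not (all isSpace l) && not (isComment l)

    go [] = []
    go (l : ls)
      | ">" `isPrefixOf` l =
          -- Everything up to the next header belongs to this record.
          let (body, rest) = break (">" `isPrefixOf`) ls
          in (drop 1 l, concat body) : go rest
      | otherwise = go ls  -- data before the first header is skipped
```

For example, assuming comments start with #, the call parseRecords ("#" `isPrefixOf`) [">a desc", "ACGT", "", "TTGA", "# note", ">b", "GG"] yields [("a desc","ACGTTTGA"),("b","GG")].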