Data structures for manipulating (biological) sequences.
The module generally supports both nucleotide and protein sequences, although
some functions, like revcompl, only make sense for nucleotides.
|A sequence is a header, sequence data itself, and optional quality data.
All items are lazy bytestrings. The Offset type can be used for indexing.
|A sequence consists of a header, the sequence data itself, and optional quality data.
|An offset, index, or length of a SeqData
|The basic data type used in Sequences
|Quality data is normally associated with nucleotide sequences
|Basic type for quality data. Range 0..255. Typical Phred output is in
the range 6..50, with 20 as the line in the sand separating good from bad.
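For orientation, Phred scores are conventionally defined as Q = -10 * log10(P), where P is the estimated probability that a base call is wrong, so the threshold of 20 corresponds to an error rate of 1%. The following is a minimal sketch of that relationship. The names `phredToErrorProb` and `isGoodQual` are hypothetical and not part of this module, and `Qual` is modelled as a `Word8` only as an assumption that matches the stated range 0..255.

```haskell
import Data.Word (Word8)

-- Assumption: Qual modelled as a single byte, matching the range 0..255.
type Qual = Word8

-- Error probability implied by a Phred score: P = 10^(-Q/10).
phredToErrorProb :: Qual -> Double
phredToErrorProb q = 10 ** (negate (fromIntegral q) / 10)

-- Hypothetical helper applying the conventional cut-off of 20.
isGoodQual :: Qual -> Bool
isGoodQual q = q >= 20
```

Under this definition a score of 30 implies one expected error per 1000 base calls.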
|Quality data is a Qual vector, currently implemented as a ByteString.
|Read the character at the specified position in the sequence.
|Return sequence length.
|Return sequence label (first word of header)
|Return full header.
|Return the sequence data.
|Check whether the sequence has associated quality data.
|Return the quality data, or raise an error if none exists. Use hasqual if in doubt.
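The types and accessors described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the module's actual source: the constructor name `Seq`, its field order, and the function bodies are guesses based only on the descriptions here, and the label is taken to be everything up to the first whitespace character of the header.

```haskell
import qualified Data.ByteString.Lazy.Char8 as B
import Data.Char (isSpace)
import Data.Int (Int64)
import Data.Maybe (isJust)

type Offset   = Int64         -- offset, index, or length of a SeqData
type SeqData  = B.ByteString  -- lazy bytestring
type QualData = B.ByteString  -- one byte per quality value

-- Assumed layout: header, sequence data, optional quality data.
data Sequence = Seq SeqData SeqData (Maybe QualData)

-- Character at a given position (zero-based) in the sequence data.
(!) :: Sequence -> Offset -> Char
(Seq _ d _) ! i = B.index d i

seqlength :: Sequence -> Offset
seqlength (Seq _ d _) = B.length d

-- Label: the first word of the header.
seqlabel :: Sequence -> SeqData
seqlabel (Seq h _ _) = B.takeWhile (not . isSpace) h

hasqual :: Sequence -> Bool
hasqual (Seq _ _ q) = isJust q

-- Partial: fails when no quality data is present, so check hasqual first.
seqqual :: Sequence -> QualData
seqqual (Seq _ _ (Just q)) = q
seqqual (Seq _ _ Nothing)  = error "seqqual: no quality data"
```

In this sketch, `Seq (B.pack "seq1 example") (B.pack "ACGT") Nothing` would be a sequence of length 4 with label `seq1` and no quality data.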
|Adding information to header
|Modify the header by appending text, or by replacing
all but the sequence label (i.e. first word).
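The two header operations can be illustrated on a bare header value. This is a sketch only: `appendToHeader` and `replaceDescription` are hypothetical names, the header is assumed to be a lazy `Char8` bytestring, and a single space is assumed as the separator between the label and the added text.

```haskell
import qualified Data.ByteString.Lazy.Char8 as B
import Data.Char (isSpace)

-- Append text to the end of an existing header.
appendToHeader :: B.ByteString -> String -> B.ByteString
appendToHeader h t = B.append h (B.pack (' ' : t))

-- Keep the label (first word) and replace the rest of the header.
replaceDescription :: B.ByteString -> String -> B.ByteString
replaceDescription h t =
  B.append (B.takeWhile (not . isSpace) h) (B.pack (' ' : t))
```

Given the header `seq1 old text`, appending `new` yields `seq1 old text new`, while replacing yields `seq1 new`.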
|Converting to and from [Char]
|Convert a String to SeqData
|Convert a SeqData to a String
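Assuming SeqData is a lazy `Char8` bytestring, as the module description suggests, the two conversions amount to packing and unpacking. The names `stringToSeqData` and `seqDataToString` below are hypothetical stand-ins used for illustration.

```haskell
import qualified Data.ByteString.Lazy.Char8 as B

-- Assumption: SeqData is a lazy bytestring with one byte per character.
type SeqData = B.ByteString

stringToSeqData :: String -> SeqData
stringToSeqData = B.pack

seqDataToString :: SeqData -> String
seqDataToString = B.unpack
```

Note that `B.pack` truncates characters above code point 255 to a single byte, so the round trip is only lossless for 8-bit text, which covers the nucleotide and amino acid alphabets.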
|Nucleotide sequences contain the alphabet [A,C,G,T].
IUPAC specifies an extended nucleotide alphabet with wildcards, but
it is not supported at this point.
|Complement a single character, i.e. identify the nucleotide it
can hybridize with. Note that for a sequence of several nucleotides, you
usually want the reverse complement (see revcompl for that).
|Calculate the reverse complement.
This is only relevant for the nucleotide alphabet,
and it leaves other characters unmodified.
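A minimal sketch of the complement and reverse complement as described, operating directly on sequence data. The names `complChar` and `revcomplData` are hypothetical, and handling lower-case letters is an assumption that goes beyond the [A,C,G,T] alphabet stated above.

```haskell
import qualified Data.ByteString.Lazy.Char8 as B

-- Complement one nucleotide; any other character is returned unchanged.
complChar :: Char -> Char
complChar c = case c of
  'A' -> 'T'
  'T' -> 'A'
  'C' -> 'G'
  'G' -> 'C'
  'a' -> 't'  -- assumption: lower case handled the same way
  't' -> 'a'
  'c' -> 'g'
  'g' -> 'c'
  _   -> c

-- Reverse the data, then complement each character.
revcomplData :: B.ByteString -> B.ByteString
revcomplData = B.map complChar . B.reverse
```

For example, the input `AACG` is reversed to `GCAA` and then complemented to `CGTT`. Applying the function twice returns the original data.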
|Proteins are chains of amino acids, represented by the IUPAC alphabet.
|Translate a nucleotide sequence into the corresponding protein
sequence. This works rather blindly, with no attempt to identify ORFs
or otherwise QA the result.
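To illustrate what such a blind translation looks like, here is a sketch using the standard genetic code, reading codons from the first position onwards. It works on plain upper-case strings rather than on the module's own types, and the names, the use of `X` for unrecognised codons, and the dropping of a trailing partial codon are all assumptions made for this example.

```haskell
import Data.List (elemIndex)

-- Standard genetic code; codons are ordered with bases T, C, A, G
-- at each of the three positions (TTT, TTC, TTA, TTG, TCT, ...).
standardCode :: String
standardCode =
  "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

baseIndex :: Char -> Maybe Int
baseIndex c = elemIndex c "TCAG"

-- Translate one codon to a one-letter amino acid code ('*' for stop).
translateCodon :: Char -> Char -> Char -> Char
translateCodon a b c =
  case (baseIndex a, baseIndex b, baseIndex c) of
    (Just i, Just j, Just k) -> standardCode !! (16 * i + 4 * j + k)
    _                        -> 'X'  -- assumption: unknown codon

-- Translate codon by codon; no ORF detection, no frame selection.
translateStr :: String -> String
translateStr (a:b:c:rest) = translateCodon a b c : translateStr rest
translateStr _            = []  -- trailing partial codon is dropped
```

With this sketch, `translateStr "ATGGCCTAA"` gives `"MA*"`: methionine, alanine, and a stop codon that is reported but does not end the translation.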
|Convert a sequence in IUPAC format to a list of amino acids.
|Convert a list of amino acids to a sequence in IUPAC format.
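The two conversions can be pictured as a lookup between an amino acid type and the one-letter IUPAC codes. The `Amino` type below is a hypothetical stand-in covering the 20 standard residues plus a stop and an unknown marker; the module's own amino acid type and function names may differ.

```haskell
import Data.Maybe (fromMaybe)

-- Hypothetical amino acid type for illustration only.
data Amino = Ala | Arg | Asn | Asp | Cys | Gln | Glu | Gly | His | Ile
           | Leu | Lys | Met | Phe | Pro | Ser | Thr | Trp | Tyr | Val
           | Stop | Unknown
  deriving (Show, Eq, Enum, Bounded)

-- One-letter codes, in constructor order.
aminoLetters :: [(Amino, Char)]
aminoLetters = zip [Ala ..] "ARNDCQEGHILKMFPSTWYV*X"

aminoToChar :: Amino -> Char
aminoToChar a = fromMaybe 'X' (lookup a aminoLetters)

-- Upper-case codes only; anything unrecognised becomes Unknown.
charToAmino :: Char -> Amino
charToAmino c = fromMaybe Unknown (lookup c [(l, a) | (a, l) <- aminoLetters])

aminosToString :: [Amino] -> String
aminosToString = map aminoToChar

stringToAminos :: String -> [Amino]
stringToAminos = map charToAmino
```

Converting `[Met, Ala, Stop]` to a string gives `"MA*"`, and the conversion round-trips for every constructor of this sketch type.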