BiobaseXNA: Efficient RNA/DNA representations




The primary structure: an interface for the efficient encoding of RNA and DNA sequences. The design targets the vector library and repa. In particular, everything is strict; if you want to stream full genomes, use Text or lazy ByteStrings instead and convert to the Biobase.Primary definitions only at the last moment.

NOTE: individual nucleotides are encoded as Ints internally, without any tagging. This means that, at this level, we have no way of deciding whether we are dealing with RNA or DNA.
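The consequence of the untagged encoding can be sketched as follows. This is a minimal, self-contained illustration, not the library's code: the Nuc type below is a stand-in, the helpers encodeDNA and encodeRNA are hypothetical, and the numbering (N=0, A=1, C=2, G=3, T/U=4) is an assumption made for the example.

```haskell
-- Hypothetical stand-in for the library's Nuc: a bare newtype over Int.
newtype Nuc = Nuc { unNuc :: Int }
  deriving (Eq, Ord, Show)

-- Assumed numbering, for illustration only: N=0, A=1, C=2, G=3, T/U=4.
encodeDNA :: Char -> Nuc
encodeDNA c = Nuc $ case c of
  'A' -> 1
  'C' -> 2
  'G' -> 3
  'T' -> 4
  _   -> 0

-- Same codes, except that 'U' takes the slot of 'T'.
encodeRNA :: Char -> Nuc
encodeRNA c = case c of
  'U' -> Nuc 4
  'T' -> Nuc 0
  _   -> encodeDNA c

-- Under these assumptions the encoded forms of a DNA sequence and of its
-- RNA counterpart are identical, so nothing in the value (or its type)
-- records which alphabet it came from.
sameEncoding :: Bool
sameEncoding = map encodeDNA "GATTACA" == map encodeRNA "GAUUACA"
```

Code that needs to know the alphabet therefore has to carry that information separately from the encoded sequence.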


Convert different types of sequence representations to the internal representation

class MkPrimary a where

Given a sequence of nucleotides encoded in some text-form, create a Nuc-based unboxed vector.


mkPrimary :: a -> Primary
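A rough sketch of how such a conversion class can be laid out is given here. It is not the library's implementation: a plain list stands in for the unboxed vector so that the example needs only base, the String instance and the numbering inside mkNuc are assumptions, and only the [Nuc] instance is taken from the instance list on this page.

```haskell
{-# LANGUAGE FlexibleInstances #-}

-- Hypothetical stand-in for the library's Nuc.
newtype Nuc = Nuc { unNuc :: Int }
  deriving (Eq, Show)

-- Stand-in for the real Primary, which is an unboxed vector of Nuc.
type Primary = [Nuc]

-- Assumed numbering, for illustration only.
mkNuc :: Char -> Nuc
mkNuc c = Nuc $ case c of
  'A' -> 1
  'C' -> 2
  'G' -> 3
  'U' -> 4
  'T' -> 4
  _   -> 0

class MkPrimary a where
  mkPrimary :: a -> Primary

-- Text-form input: encode every character (assumed instance).
instance MkPrimary String where
  mkPrimary = map mkNuc

-- Already-encoded input: nothing to translate.
instance MkPrimary [Nuc] where
  mkPrimary = id
```

With this layout, callers can write mkPrimary "ACGU" or pass an existing [Nuc] without caring which representation they started from.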

Efficient nucleotide encoding

newtype Nuc




unNuc :: Int


Bounded Nuc

The bounded instance from GHC proper. Captures all defined symbols.

Enum Nuc


Eq Nuc 
Ord Nuc 
Read Nuc

Human-readable Read instance.

Show Nuc

Human-readable Show instance.

Ix Nuc 
Prim Nuc 
Unbox Nuc 
Bounds Nuc

Special bounds for energy / score arrays

IsostericityLookup ExtPair

For extended basepairs, we take the default mapping and go from there.

TODO inClass missing

IsostericityLookup Pair

Normal basepairs are assumed to have cWW basepairing.

TODO inClass missing

Vector Vector Nuc 
MVector MVector Nuc 
MkPrimary [Nuc] 
(Shape sh, Show sh) => Shape (sh :. Nuc) 
MkViennaPair (Nuc, Nuc) 
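
As an illustration of what the Bounded and Enum instances listed above make possible, the following sketch enumerates every symbol of a stand-in Nuc type. The type, the bounds 0 and 4, and the name allNucs are assumptions made for the example; the library's actual symbol set may differ.

```haskell
-- Hypothetical stand-in for the library's Nuc.
newtype Nuc = Nuc { unNuc :: Int }
  deriving (Eq, Ord, Show)

-- Assumed range: five symbols, numbered 0 to 4.
instance Bounded Nuc where
  minBound = Nuc 0
  maxBound = Nuc 4

-- Enum simply exposes the underlying Int.
instance Enum Nuc where
  toEnum   = Nuc
  fromEnum = unNuc

-- All defined symbols, in encoding order.
allNucs :: [Nuc]
allNucs = [minBound .. maxBound]
```

Enumerating from minBound to maxBound is a convenient way to fill a lookup table with one entry per symbol.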

mkNuc :: Char -> Nuc

Translate between Chars and Nucs.
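
A Char-to-Nuc translation and its inverse can be sketched with a small symbol table. Everything below is an assumption made for illustration: the table "NACGU", the case-insensitive lookup, the fallback to N for unknown characters, and the helper name fromNuc are not taken from the library.

```haskell
import Data.Char (toUpper)
import Data.Maybe (fromMaybe)

-- Hypothetical stand-in for the library's Nuc.
newtype Nuc = Nuc { unNuc :: Int }
  deriving (Eq, Ord, Show)

-- Assumed symbol table: N=0, A=1, C=2, G=3, U=4.
symbols :: [(Char, Nuc)]
symbols = zip "NACGU" (map Nuc [0 ..])

-- Char to Nuc; unknown characters fall back to N (an assumption).
mkNuc :: Char -> Nuc
mkNuc c = fromMaybe (Nuc 0) (lookup (toUpper c) symbols)

-- Nuc back to Char (hypothetical helper); unknown codes map to 'N'.
fromNuc :: Nuc -> Char
fromNuc n = fromMaybe 'N' (lookup n [(v, k) | (k, v) <- symbols])
```

For example, map fromNuc (map mkNuc "acgux") yields "ACGUN" under these assumptions.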

Instances of different type classes

Instances for Nuc

Instances for MkPrimary