Safe Haskell | None |
---|---|
Language | Haskell2010 |
Common data types used everywhere. This module is a collection of very basic "bioinformatics" data types that are simple, but don't make sense to define over and over.
Synopsis
- newtype Nucleotide = N {}
- newtype Nucleotides = Ns {}
- newtype Qual = Q {}
- toQual :: (Floating a, RealFrac a) => a -> Qual
- fromQual :: Qual -> Double
- fromQualRaised :: Double -> Qual -> Double
- probToQual :: (Floating a, RealFrac a) => Prob' a -> Qual
- newtype Prob' a = Pr { unPr :: a }
- type Prob = Prob' Double
- toProb :: Floating a => a -> Prob' a
- fromProb :: Floating a => Prob' a -> a
- qualToProb :: Floating a => Qual -> Prob' a
- pow :: Num a => Prob' a -> a -> Prob' a
- data Pair a b = !a :!: !b
- data Word8
- nucA :: Nucleotide
- nucC :: Nucleotide
- nucG :: Nucleotide
- nucT :: Nucleotide
- nucsA :: Nucleotides
- nucsC :: Nucleotides
- nucsG :: Nucleotides
- nucsT :: Nucleotides
- nucsN :: Nucleotides
- gap :: Nucleotides
- toNucleotide :: Char -> Nucleotide
- toNucleotides :: Char -> Nucleotides
- nucToNucs :: Nucleotide -> Nucleotides
- showNucleotide :: Nucleotide -> Char
- showNucleotides :: Nucleotides -> Char
- isGap :: Nucleotides -> Bool
- isBase :: Nucleotides -> Bool
- isProperBase :: Nucleotides -> Bool
- properBases :: [Nucleotides]
- compl :: Nucleotide -> Nucleotide
- compls :: Nucleotides -> Nucleotides
- data Position = Pos { p_seq :: !ByteString, p_start :: !Int }
- shiftPosition :: Int -> Position -> Position
- p_is_reverse :: Position -> Bool
- data Range = Range {}
- shiftRange :: Int -> Range -> Range
- reverseRange :: Range -> Range
- extendRange :: Int -> Range -> Range
- insideRange :: Range -> Range -> Range
- wrapRange :: Int -> Range -> Range
Documentation
newtype Nucleotide Source #
A nucleotide base. We only represent A,C,G,T. The contained Word8 is guaranteed to be 0..3.
Instances
newtype Nucleotides Source #
A nucleotide base in an alignment. Experience says we're dealing with Ns and gaps all the time, so purity be damned, they are included as if they were real bases.
To allow Nucleotides to be unpacked and incorporated into containers, we choose to represent them the same way as the BAM file format: as a 4 bit wide field. Gaps are encoded as 0 where they make sense, N is 15. The contained Word8 is guaranteed to be 0..15.
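The 4-bit layout can be illustrated with a small standalone sketch. It assumes the BAM convention of one bit per base (A = 1, C = 2, G = 4, T = 8), which this page does not spell out beyond "gap is 0, N is 15". The names `Nucs`, `sketchA`, `charToNucs` and `nucsToChar` are hypothetical and are not part of Bio.Base. The sketch also accepts lower-case input and maps ambiguity codes to N, which may differ from what the library does.

```haskell
import Data.Bits ((.&.))
import Data.Char (toUpper)
import Data.Word (Word8)

-- Hypothetical stand-in for the library's Nucleotides newtype.
newtype Nucs = Nucs Word8 deriving (Eq, Show)

-- Assumed BAM-style codes: one bit per base, 0 = gap, 15 = N (all bits set).
sketchGap, sketchA, sketchC, sketchG, sketchT, sketchN :: Nucs
sketchGap = Nucs 0
sketchA   = Nucs 1
sketchC   = Nucs 2
sketchG   = Nucs 4
sketchT   = Nucs 8
sketchN   = Nucs 15

-- Character to code: A,C,G,T and U are recognised, '-' and '.' become
-- gaps, everything else becomes N (as described for toNucleotides).
charToNucs :: Char -> Nucs
charToNucs c = case toUpper c of
  'A' -> sketchA
  'C' -> sketchC
  'G' -> sketchG
  'T' -> sketchT
  'U' -> sketchT
  '-' -> sketchGap
  '.' -> sketchGap
  _   -> sketchN

-- Code to character via a 16-entry lookup table (IUPAC letters, '-' for gap).
nucsToChar :: Nucs -> Char
nucsToChar (Nucs w) = "-ACMGRSVTWYHKDBN" !! fromIntegral (w .&. 15)
```

Because every code fits in four bits, two such values pack into one byte, which is what makes the representation convenient for unboxed containers.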
Instances
newtype Qual Source #
Qualities are stored in deciban, also known as the Phred scale. To represent a value p, we store -10 * log_10 p. Operations work directly on the "Phred" value, as the name suggests. The same goes for the Ord instance: greater quality means higher "Phred" score, meaning lower error probability.
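As a worked illustration of the Phred formula, the sketch below converts between an error probability and an integer Phred score. `probToPhred` and `phredToProb` are hypothetical helper names for this example only; they are not the library's `toQual` and `fromQual`, whose rounding and clamping behaviour is not described here.

```haskell
-- Error probability to Phred score: q = -10 * log_10 p, rounded.
probToPhred :: Double -> Int
probToPhred p = round (-10 * logBase 10 p)

-- Phred score back to error probability: p = 10 ^ (-q / 10).
phredToProb :: Int -> Double
phredToProb q = 10 ** (negate (fromIntegral q) / 10)
```

An error probability of 0.001 corresponds to a score of 30, and a higher score always maps back to a lower error probability.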
Instances
Bounded Qual Source # | |
Eq Qual Source # | |
Ord Qual Source # | |
Show Qual Source # | |
Storable Qual Source # | |
Defined in Bio.Base
sizeOf :: Qual -> Int Source #
alignment :: Qual -> Int Source #
peekElemOff :: Ptr Qual -> Int -> IO Qual Source #
pokeElemOff :: Ptr Qual -> Int -> Qual -> IO () Source #
peekByteOff :: Ptr b -> Int -> IO Qual Source #
pokeByteOff :: Ptr b -> Int -> Qual -> IO () Source #
newtype Prob' a Source #
A positive floating point value stored in log domain. We store the natural logarithm (makes computation easier), but allow conversions to the familiar "Phred" scale used for Qual values.
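To show why the log domain makes computation easier, here is a minimal standalone sketch of log-domain arithmetic. `LogProb` and its helpers are hypothetical names, not the library's `Prob'` or its instances. Multiplication becomes addition of logarithms, and addition factors out the larger term so that tiny probabilities do not underflow.

```haskell
-- Hypothetical stand-in for a probability stored as its natural logarithm.
newtype LogProb = LogProb Double deriving (Eq, Ord, Show)

toLogProb :: Double -> LogProb
toLogProb = LogProb . log

fromLogProb :: LogProb -> Double
fromLogProb (LogProb l) = exp l

-- Product of probabilities: add the logarithms.
mulLog :: LogProb -> LogProb -> LogProb
mulLog (LogProb a) (LogProb b) = LogProb (a + b)

-- Sum of probabilities: log (e^a + e^b) = hi + log (1 + e^(lo - hi)).
addLog :: LogProb -> LogProb -> LogProb
addLog (LogProb a) (LogProb b)
  | isInfinite hi && hi < 0 = LogProb hi  -- both operands are probability 0
  | otherwise               = LogProb (hi + log (1 + exp (lo - hi)))
  where
    hi = max a b
    lo = min a b

-- Conversion to the Phred scale: -10 * log_10 p = -10 * ln p / ln 10.
logProbToPhred :: LogProb -> Double
logProbToPhred (LogProb l) = -10 * l / log 10
```

Note that the Ord instance derived here orders values by their logarithm, which agrees with the order of the underlying probabilities because the logarithm is monotonic.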
Instances
Eq a => Eq (Prob' a) Source # | |
(Floating a, Fractional a, Ord a) => Fractional (Prob' a) Source # | |
(Floating a, Ord a) => Num (Prob' a) Source # | |
Defined in Bio.Base | |
Ord a => Ord (Prob' a) Source # | |
RealFloat a => Show (Prob' a) Source # | |
Storable a => Storable (Prob' a) Source # | |
Defined in Bio.Base
sizeOf :: Prob' a -> Int Source #
alignment :: Prob' a -> Int Source #
peekElemOff :: Ptr (Prob' a) -> Int -> IO (Prob' a) Source #
pokeElemOff :: Ptr (Prob' a) -> Int -> Prob' a -> IO () Source #
peekByteOff :: Ptr b -> Int -> IO (Prob' a) Source #
pokeByteOff :: Ptr b -> Int -> Prob' a -> IO () Source #
data Pair a b Source #
A strict pair.
Constructors
!a :!: !b infixl 2
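A typical use of a strict pair is as the accumulator of a left fold, where a lazy tuple would build up unevaluated thunks. The sketch below redeclares the type locally, copying the declaration from the synopsis, so that it compiles on its own; `mean` is a hypothetical example function and not part of Bio.Base.

```haskell
import Data.List (foldl')

-- Local copy of the strict pair, so the example is self-contained.
data Pair a b = !a :!: !b deriving (Eq, Show)
infixl 2 :!:

-- Running sum and count in one pass; both fields are forced at every step.
mean :: [Double] -> Double
mean xs = s / fromIntegral n
  where
    s :!: n = foldl' step (0 :!: (0 :: Int)) xs
    step (acc :!: k) x = (acc + x) :!: (k + 1)
```

With an ordinary tuple, `foldl'` would only force the tuple constructor and leave the sum and count as growing chains of additions. The strict fields avoid that without any extra `seq` calls.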
Instances
(Bounded a, Bounded b) => Bounded (Pair a b) Source # | |
(Eq a, Eq b) => Eq (Pair a b) Source # | |
(Ord a, Ord b) => Ord (Pair a b) Source # | |
(Read a, Read b) => Read (Pair a b) Source # | |
(Show a, Show b) => Show (Pair a b) Source # | |
(Ix a, Ix b) => Ix (Pair a b) Source # | |
Defined in Bio.Base
range :: (Pair a b, Pair a b) -> [Pair a b] Source #
index :: (Pair a b, Pair a b) -> Pair a b -> Int Source #
unsafeIndex :: (Pair a b, Pair a b) -> Pair a b -> Int
inRange :: (Pair a b, Pair a b) -> Pair a b -> Bool Source #
rangeSize :: (Pair a b, Pair a b) -> Int Source #
unsafeRangeSize :: (Pair a b, Pair a b) -> Int
data Word8 Source #
8-bit unsigned integer type
Instances
Bounded Word8 | Since: base-2.1 |
Enum Word8 | Since: base-2.1 |
Defined in GHC.Word succ :: Word8 -> Word8 Source # pred :: Word8 -> Word8 Source # toEnum :: Int -> Word8 Source # fromEnum :: Word8 -> Int Source # enumFrom :: Word8 -> [Word8] Source # enumFromThen :: Word8 -> Word8 -> [Word8] Source # enumFromTo :: Word8 -> Word8 -> [Word8] Source # enumFromThenTo :: Word8 -> Word8 -> Word8 -> [Word8] Source # | |
Eq Word8 | Since: base-2.1 |
Integral Word8 | Since: base-2.1 |
Defined in GHC.Word | |
Data Word8 | Since: base-4.0.0.0 |
Defined in Data.Data gfoldl :: (forall d b. Data d => c (d -> b) -> d -> c b) -> (forall g. g -> c g) -> Word8 -> c Word8 Source # gunfold :: (forall b r. Data b => c (b -> r) -> c r) -> (forall r. r -> c r) -> Constr -> c Word8 Source # toConstr :: Word8 -> Constr Source # dataTypeOf :: Word8 -> DataType Source # dataCast1 :: Typeable t => (forall d. Data d => c (t d)) -> Maybe (c Word8) Source # dataCast2 :: Typeable t => (forall d e. (Data d, Data e) => c (t d e)) -> Maybe (c Word8) Source # gmapT :: (forall b. Data b => b -> b) -> Word8 -> Word8 Source # gmapQl :: (r -> r' -> r) -> r -> (forall d. Data d => d -> r') -> Word8 -> r Source # gmapQr :: (r' -> r -> r) -> r -> (forall d. Data d => d -> r') -> Word8 -> r Source # gmapQ :: (forall d. Data d => d -> u) -> Word8 -> [u] Source # gmapQi :: Int -> (forall d. Data d => d -> u) -> Word8 -> u Source # gmapM :: Monad m => (forall d. Data d => d -> m d) -> Word8 -> m Word8 Source # gmapMp :: MonadPlus m => (forall d. Data d => d -> m d) -> Word8 -> m Word8 Source # gmapMo :: MonadPlus m => (forall d. Data d => d -> m d) -> Word8 -> m Word8 Source # | |
Num Word8 | Since: base-2.1 |
Ord Word8 | Since: base-2.1 |
Read Word8 | Since: base-2.1 |
Real Word8 | Since: base-2.1 |
Show Word8 | Since: base-2.1 |
Ix Word8 | Since: base-2.1 |
Lift Word8 | |
PrintfArg Word8 | Since: base-2.1 |
Defined in Text.Printf formatArg :: Word8 -> FieldFormatter Source # parseFormat :: Word8 -> ModifierParser Source # | |
Storable Word8 | Since: base-2.1 |
Defined in Foreign.Storable sizeOf :: Word8 -> Int Source # alignment :: Word8 -> Int Source # peekElemOff :: Ptr Word8 -> Int -> IO Word8 Source # pokeElemOff :: Ptr Word8 -> Int -> Word8 -> IO () Source # peekByteOff :: Ptr b -> Int -> IO Word8 Source # pokeByteOff :: Ptr b -> Int -> Word8 -> IO () Source # | |
Bits Word8 | Since: base-2.1 |
Defined in GHC.Word (.&.) :: Word8 -> Word8 -> Word8 Source # (.|.) :: Word8 -> Word8 -> Word8 Source # xor :: Word8 -> Word8 -> Word8 Source # complement :: Word8 -> Word8 Source # shift :: Word8 -> Int -> Word8 Source # rotate :: Word8 -> Int -> Word8 Source # setBit :: Word8 -> Int -> Word8 Source # clearBit :: Word8 -> Int -> Word8 Source # complementBit :: Word8 -> Int -> Word8 Source # testBit :: Word8 -> Int -> Bool Source # bitSizeMaybe :: Word8 -> Maybe Int Source # bitSize :: Word8 -> Int Source # isSigned :: Word8 -> Bool Source # shiftL :: Word8 -> Int -> Word8 Source # unsafeShiftL :: Word8 -> Int -> Word8 Source # shiftR :: Word8 -> Int -> Word8 Source # unsafeShiftR :: Word8 -> Int -> Word8 Source # rotateL :: Word8 -> Int -> Word8 Source # | |
FiniteBits Word8 | Since: base-4.6.0.0 |
Hashable Word8 | |
Defined in Data.Hashable.Class | |
Unbox Word8 | |
Defined in Data.Vector.Unboxed.Base | |
Prim Word8 | |
Defined in Data.Primitive.Types alignment# :: Word8 -> Int# indexByteArray# :: ByteArray# -> Int# -> Word8 readByteArray# :: MutableByteArray# s -> Int# -> State# s -> (#State# s, Word8#) writeByteArray# :: MutableByteArray# s -> Int# -> Word8 -> State# s -> State# s setByteArray# :: MutableByteArray# s -> Int# -> Int# -> Word8 -> State# s -> State# s indexOffAddr# :: Addr# -> Int# -> Word8 readOffAddr# :: Addr# -> Int# -> State# s -> (#State# s, Word8#) writeOffAddr# :: Addr# -> Int# -> Word8 -> State# s -> State# s setOffAddr# :: Addr# -> Int# -> Int# -> Word8 -> State# s -> State# s | |
Vector Vector Word8 | |
Defined in Data.Vector.Unboxed.Base basicUnsafeFreeze :: PrimMonad m => Mutable Vector (PrimState m) Word8 -> m (Vector Word8) basicUnsafeThaw :: PrimMonad m => Vector Word8 -> m (Mutable Vector (PrimState m) Word8) basicLength :: Vector Word8 -> Int basicUnsafeSlice :: Int -> Int -> Vector Word8 -> Vector Word8 basicUnsafeIndexM :: Monad m => Vector Word8 -> Int -> m Word8 basicUnsafeCopy :: PrimMonad m => Mutable Vector (PrimState m) Word8 -> Vector Word8 -> m () | |
MVector MVector Word8 | |
Defined in Data.Vector.Unboxed.Base basicLength :: MVector s Word8 -> Int basicUnsafeSlice :: Int -> Int -> MVector s Word8 -> MVector s Word8 basicOverlaps :: MVector s Word8 -> MVector s Word8 -> Bool basicUnsafeNew :: PrimMonad m => Int -> m (MVector (PrimState m) Word8) basicInitialize :: PrimMonad m => MVector (PrimState m) Word8 -> m () basicUnsafeReplicate :: PrimMonad m => Int -> Word8 -> m (MVector (PrimState m) Word8) basicUnsafeRead :: PrimMonad m => MVector (PrimState m) Word8 -> Int -> m Word8 basicUnsafeWrite :: PrimMonad m => MVector (PrimState m) Word8 -> Int -> Word8 -> m () basicClear :: PrimMonad m => MVector (PrimState m) Word8 -> m () basicSet :: PrimMonad m => MVector (PrimState m) Word8 -> Word8 -> m () basicUnsafeCopy :: PrimMonad m => MVector (PrimState m) Word8 -> MVector (PrimState m) Word8 -> m () basicUnsafeMove :: PrimMonad m => MVector (PrimState m) Word8 -> MVector (PrimState m) Word8 -> m () basicUnsafeGrow :: PrimMonad m => MVector (PrimState m) Word8 -> Int -> m (MVector (PrimState m) Word8) | |
data Vector Word8 | |
Defined in Data.Vector.Unboxed.Base | |
data MVector s Word8 | |
Defined in Data.Vector.Unboxed.Base |
nucA :: Nucleotide Source #
nucC :: Nucleotide Source #
nucG :: Nucleotide Source #
nucT :: Nucleotide Source #
nucsA :: Nucleotides Source #
nucsC :: Nucleotides Source #
nucsG :: Nucleotides Source #
nucsT :: Nucleotides Source #
nucsN :: Nucleotides Source #
gap :: Nucleotides Source #
toNucleotide :: Char -> Nucleotide Source #
Converts a character into a Nucleotide. The usual codes for A,C,G,T and U are understood. Since a Nucleotide can only represent A,C,G,T, gaps and N have no representation here; use toNucleotides for those.
toNucleotides :: Char -> Nucleotides Source #
Converts a character into a Nucleotides. The usual codes for A,C,G,T and U are understood, - and . become gaps and everything else is an N.
nucToNucs :: Nucleotide -> Nucleotides Source #
showNucleotide :: Nucleotide -> Char Source #
showNucleotides :: Nucleotides -> Char Source #
isGap :: Nucleotides -> Bool Source #
Tests if a Nucleotides is a gap. Returns True only for the gap.
isBase :: Nucleotides -> Bool Source #
Tests if a Nucleotides is a base. Returns True for everything but gaps.
isProperBase :: Nucleotides -> Bool Source #
Tests if a Nucleotides is a proper base. Returns True for A,C,G,T only.
properBases :: [Nucleotides] Source #
compl :: Nucleotide -> Nucleotide Source #
Complements a Nucleotide.
compls :: Nucleotides -> Nucleotides Source #
Complements a Nucleotides.
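For the 4-bit representation, complementing can be sketched as a bit swap. This assumes the BAM bit layout (A = 1, C = 2, G = 4, T = 8), which is not confirmed on this page, and `complCode` is a hypothetical function operating on the raw code rather than the library's `compls`. Under that assumption, swapping A with T and C with G amounts to reversing the four bits.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Word (Word8)

-- Complement a 4-bit nucleotide code by reversing its bit order.
-- Assumed layout: bit 0 = A, bit 1 = C, bit 2 = G, bit 3 = T.
complCode :: Word8 -> Word8
complCode w =
      ((w .&. 1) `shiftL` 3)  -- A becomes T
  .|. ((w .&. 2) `shiftL` 1)  -- C becomes G
  .|. ((w .&. 4) `shiftR` 1)  -- G becomes C
  .|. ((w .&. 8) `shiftR` 3)  -- T becomes A
```

Gaps (0) and N (15) are unchanged by this operation, and ambiguity codes complement correctly as well: A-or-C (3) becomes G-or-T (12).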
data Position Source #
Coordinates in a genome. The position is zero-based, no questions about it. Think of the position as pointing to the crack between two bases: looking forward you see the next base to the right, looking in the reverse direction you see the complement of the first base to the left.
To encode the strand, we (virtually) reverse-complement any sequence and prepend it to the normal one. That way, reversed coordinates have a negative sign and automatically make sense. Position 0 could either be the beginning of the sequence or the end on the reverse strand... that ambiguity shouldn't really matter.
Constructors
Pos
p_seq :: !ByteString
p_start :: !Int
Instances
Eq Position Source # | |
Ord Position Source # | |
Show Position Source # | |
shiftPosition :: Int -> Position -> Position Source #
Moves a Position. The position is moved forward according to the strand, negative indexes move backward accordingly.
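The sign-based strand encoding can be sketched as follows. This is one reading of the description above, not a confirmed copy of the library's implementation: because the reverse complement is virtually prepended to the sequence, walking forward on either strand means increasing the coordinate, so a shift is a plain addition and the strand is the sign. The names `Pos'`, `isReverse` and `shiftPos` are hypothetical, and the sequence name is a String here instead of a ByteString to keep the example dependency-free.

```haskell
-- Hypothetical stand-in for Position; a negative start means reverse strand.
data Pos' = Pos' { posSeq :: String, posStart :: Int } deriving (Eq, Show)

-- A position lies on the reverse strand if its coordinate is negative.
isReverse :: Pos' -> Bool
isReverse = (< 0) . posStart

-- Move along the strand: positive offsets go forward in reading
-- direction on both strands, negative offsets go backward.
shiftPos :: Int -> Pos' -> Pos'
shiftPos n p = p { posStart = posStart p + n }
```

For example, shifting a reverse-strand position at -5 forward by 3 gives -2, which is three bases further along the reverse-complemented sequence.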
p_is_reverse :: Position -> Bool Source #
data Range Source #
Ranges in genomes. We combine a position with a length. In Range pos len, pos is always the start of a stretch of length len. Positions therefore move in the opposite direction on the reverse strand. To get the same stretch on the reverse strand, shift r_pos by r_length, then reverse direction (or call reverseRange).
reverseRange :: Range -> Range Source #
Reverses a Range to give the same Range on the opposite strand.
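The recipe "shift the start by the length, then reverse direction" can be written out as a small sketch. It assumes that reversing direction means negating the coordinate, as suggested by the sign-based strand encoding of Position; the library's actual code is not shown on this page. `Rng` and `reverseRng` are hypothetical names, and the record is flattened to a sequence name, start and length for brevity.

```haskell
-- Hypothetical flattened stand-in for Range.
data Rng = Rng
  { rngSeq    :: String  -- sequence name
  , rngStart  :: Int     -- start coordinate, negative on the reverse strand
  , rngLength :: Int     -- length of the stretch
  } deriving (Eq, Show)

-- Same stretch on the opposite strand: move to the far end of the
-- range, then flip the sign of the coordinate. The length is unchanged.
reverseRng :: Rng -> Rng
reverseRng r = r { rngStart = negate (rngStart r + rngLength r) }
```

A forward range starting at 10 with length 5 becomes a reverse range starting at -15 with length 5, and applying the function twice returns the original range.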
extendRange :: Int -> Range -> Range Source #
Extends a range. The length of the range is simply increased.
insideRange :: Range -> Range -> Range Source #
Expands a subrange. (range1 `insideRange` range2) interprets range1 as a subrange of range2 and computes its absolute coordinates. The sequence name of range1 is ignored.