biohazard-1.0.0: bioinformatics support library

Safe Haskell: None




newtype BamKey Source #

Exactly two characters, for the "named" fields in BAM.


BamKey Word16 
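A minimal sketch of how two characters might fit into a Word16. The type KeySketch, the helpers packKey and unpackKey, and the byte order (first character in the low byte) are assumptions made for illustration and are not taken from this page.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Char (chr, ord)
import Data.Word (Word16)

-- Hypothetical stand-in for BamKey: two 8-bit characters in one Word16.
newtype KeySketch = KeySketch Word16
  deriving (Eq, Ord, Show)

-- Assumption: the first character goes into the low byte.
-- Both characters must have code points below 256.
packKey :: Char -> Char -> KeySketch
packKey a b =
  KeySketch (fromIntegral (ord a) .|. (fromIntegral (ord b) `shiftL` 8))

-- Inverse of packKey.
unpackKey :: KeySketch -> (Char, Char)
unpackKey (KeySketch w) =
  (chr (fromIntegral (w .&. 0xff)), chr (fromIntegral (w `shiftR` 8)))
```

Packing the key into a single machine word makes equality and ordering a single integer comparison, which is presumably why a Word16 is used instead of a two-character string.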

data BamSQ Source #




Eq BamSQ Source # 


(==) :: BamSQ -> BamSQ -> Bool #

(/=) :: BamSQ -> BamSQ -> Bool #

Show BamSQ Source # 


showsPrec :: Int -> BamSQ -> ShowS #

show :: BamSQ -> String #

showList :: [BamSQ] -> ShowS #

data BamSorting Source #

Possible sorting orders from the BAM header. Thanks to samtools, which doesn't declare sorted files properly, we have to have the stupid Unknown state, too.

newtype Refseq Source #

Reference sequence in BAM. BAM enumerates the reference sequences and then sorts by index. We need to track that index if we want to reproduce the sorting order.




invalidRefseq :: Refseq Source #

The invalid Refseq. Bam uses this value to encode a missing reference sequence.

isValidRefseq :: Refseq -> Bool Source #

Tests whether a reference sequence is valid. Returns true unless the argument equals invalidRefseq.

invalidPos :: Int Source #

The invalid position. Bam uses this value to encode a missing position.

isValidPos :: Int -> Bool Source #

Tests whether a position is valid. Returns true unless the argument equals invalidPos.
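The two sentinel checks can be sketched as follows. All names ending in Sketch are hypothetical, and the concrete sentinel values are assumptions: the BAM format stores -1 for a missing reference ID or position, but this page does not show how the library represents them.

```haskell
import Data.Word (Word32)

-- Hypothetical stand-in for Refseq; the real representation is not shown here.
newtype RefseqSketch = RefseqSketch Word32
  deriving (Eq, Ord, Show)

-- Assumption: a missing reference is -1, i.e. all ones as an unsigned
-- 32-bit value.
invalidRefseqSketch :: RefseqSketch
invalidRefseqSketch = RefseqSketch maxBound

-- True unless the argument is the sentinel.
isValidRefseqSketch :: RefseqSketch -> Bool
isValidRefseqSketch = (/= invalidRefseqSketch)

-- Assumption: a missing position is encoded as -1.
invalidPosSketch :: Int
invalidPosSketch = -1

-- True unless the argument is the sentinel.
isValidPosSketch :: Int -> Bool
isValidPosSketch = (/= invalidPosSketch)
```

Under the unsigned encoding assumed here, the invalid reference compares greater than every valid one, so records without a reference would sort last.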

type Refs = Seq BamSQ Source #

A list of reference sequences.

noRefs :: Refs Source #

The empty list of references. Needed for BAM files that don't really store alignments.

compareNames :: Seqid -> Seqid -> Ordering Source #

Compares two sequence names the way samtools does. samtools sorts by "strnum_cmp":

  • if both strings start with a digit, parse the initial sequence of digits and compare numerically; if equal, continue after the numbers
  • else compare the first characters (possibly NUL); if equal, continue after them
  • else both strings ended and the shorter one counts as smaller (and that part is stupid)
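The rules above can be sketched on plain Strings. This is an illustration of the described comparison, not the library's implementation: compareNamesSketch is a hypothetical name, String stands in for Seqid, and the end of a string plays the role of the NUL terminator.

```haskell
import Data.Char (isDigit)

-- Sketch of a "strnum_cmp"-style comparison on Strings.
compareNamesSketch :: String -> String -> Ordering
compareNamesSketch [] [] = EQ
compareNamesSketch [] _  = LT   -- the string that ends first is smaller
compareNamesSketch _  [] = GT
compareNamesSketch xs@(x : xs') ys@(y : ys')
  | isDigit x && isDigit y =
      -- Both start with a digit: compare the leading numbers, then
      -- continue after them if they are equal.
      let (dx, rx) = span isDigit xs
          (dy, ry) = span isDigit ys
      in compare (read dx :: Integer) (read dy) <> compareNamesSketch rx ry
  | otherwise =
      -- Compare the first characters, then continue after them.
      compare x y <> compareNamesSketch xs' ys'
```

With this ordering, "chr2" sorts before "chr10", whereas a plain lexicographic comparison would put "chr10" first.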

distinctBin :: Int -> Int -> Int Source #

Computes the "distinct bin" according to the BAM binning scheme. If an alignment starts at pos and its CIGAR implies a length of len on the reference, then it goes into bin distinctBin pos len.
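For orientation, here is a sketch of the standard BAM binning computation (the reg2bin function from the SAM/BAM specification), adapted to take a start and a length. The name reg2binSketch is hypothetical, and it is an assumption that distinctBin matches this function in every case, especially for a length of zero.

```haskell
import Data.Bits (shiftR)

-- Smallest bin that fully contains the zero-based interval
-- [pos, pos + len), following the UCSC-style scheme used by BAM.
-- Assumes len >= 1.
reg2binSketch :: Int -> Int -> Int
reg2binSketch pos len = go levels
  where
    end = pos + len - 1          -- last covered position, inclusive
    -- (shift, offset of the first bin on that level), finest level first
    levels = [(14, 4681), (17, 585), (20, 73), (23, 9), (26, 1)]
    go [] = 0                    -- only the root bin spans the interval
    go ((s, off) : rest)
      | pos `shiftR` s == end `shiftR` s = off + (pos `shiftR` s)
      | otherwise                        = go rest
```

For example, an alignment at position 0 with length 1 lands in the first 16 kb bin (4681), while one that straddles the boundary at position 16384 has to move up to the first 128 kb bin (585).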

data MdOp Source #


showMd :: [MdOp] -> Bytes Source #

Normalizes a series of MdOps and encodes them the way BAM and SAM expect.
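A rough sketch of what such normalization and encoding could look like for the MD field. The constructors of MdOp are not shown on this page, so the type MdOpSketch and its constructors are hypothetical, and String stands in for Bytes. The sketch merges adjacent match counts, drops empty deletions, and always emits a number between two non-match operations, as the MD syntax requires.

```haskell
-- Hypothetical, simplified stand-in for MdOp.
data MdOpSketch
  = Match Int        -- run of matching bases
  | Mismatch Char    -- one mismatching reference base
  | Deletion String  -- reference bases deleted from the read

-- Encode as an MD string: numbers alternate with mismatches ("A")
-- and deletions ("^AC"); the string starts and ends with a number.
showMdSketch :: [MdOpSketch] -> String
showMdSketch = go 0
  where
    go n []                  = show n
    go n (Match m : ops)     = go (n + m) ops             -- merge match runs
    go n (Mismatch c : ops)  = show n ++ c : go 0 ops
    go n (Deletion [] : ops) = go n ops                   -- drop empty deletions
    go n (Deletion s : ops)  = show n ++ '^' : s ++ go 0 ops
```

Carrying the pending match count as an accumulator means a zero is emitted automatically wherever two non-match operations are adjacent.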