biohazard-0.6.5: bioinformatics support library

Safe Haskell: None
Language: Haskell98

Bio.Bam.Evan

Description

This module contains functions that deal with conventions local to MPI EVAN. The code is needed regularly, but it can be harmful when applied to BAM files that follow different conventions. Most importantly, no program should call these functions by default.
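One way to honour the "not by default" rule is to make every fixup opt-in, with the identity function as the default. The following is a minimal sketch of that idea, not part of this module: `applyWhen`, `FixupOpts`, `defaultOpts` and `buildFixup` are hypothetical names, and the fixups are passed in as plain functions so that the sketch does not depend on the library.

```haskell
-- | Apply a transformation only when it was explicitly enabled.
applyWhen :: Bool -> (a -> a) -> a -> a
applyWhen enabled f = if enabled then f else id

-- | Hypothetical option record: one switch per fixup, all off by default.
data FixupOpts = FixupOpts
  { optFlagAbuse :: Bool
  , optBwaFlags  :: Bool
  }

defaultOpts :: FixupOpts
defaultOpts = FixupOpts { optFlagAbuse = False, optBwaFlags = False }

-- | Compose only the enabled fixups. The flag fixup runs first,
-- then the Bwa fixup. With 'defaultOpts' the result is 'id'.
buildFixup :: FixupOpts -> (a -> a) -> (a -> a) -> a -> a
buildFixup opts flagFix bwaFix =
  applyWhen (optBwaFlags opts) bwaFix . applyWhen (optFlagAbuse opts) flagFix
```

A program would then pass `fixupFlagAbuse` and `fixupBwaFlags` as the two function arguments and flip a switch only in response to an explicit command line option.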

Documentation

fixupFlagAbuse :: BamRec -> BamRec

Fixes abuse of the flags 0x800 and 0x1000. We used them for low quality and low complexity, but they have since been redefined. If either is set, it is cleared and recorded in the ZD field instead. Also fixes abuse of combinations of the paired, 1st mate and 2nd mate flags, which we used to indicate merging or trimming. These are canonicalized and recorded in the FF field. This function is unsafe on BAM files of unclear origin!
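The first half of this fixup amounts to masking two bits out of the flag word and remembering which of them were set. A minimal sketch on a bare 16-bit flag word follows. The names `legacyLowQuality`, `legacyLowComplexity`, `legacyMask` and `splitLegacyFlags` are hypothetical, and the sketch deliberately says nothing about how the removed bits are encoded in the ZD field or how the mate flag combinations are canonicalized.

```haskell
import Data.Bits ((.&.), (.|.), complement)
import Data.Word (Word16)

-- Hypothetical names for the two abused bits; the library's own
-- constants may be named differently.
legacyLowQuality, legacyLowComplexity, legacyMask :: Word16
legacyLowQuality    = 0x800
legacyLowComplexity = 0x1000
legacyMask          = legacyLowQuality .|. legacyLowComplexity

-- | Split a flag word into (flags with the legacy bits cleared,
-- the legacy bits that were set). A caller could record the second
-- component in an auxiliary field.
splitLegacyFlags :: Word16 -> (Word16, Word16)
splitLegacyFlags fl = (fl .&. complement legacyMask, fl .&. legacyMask)
```

For example, `splitLegacyFlags 0x1801` yields `(0x1, 0x1800)`, while a flag word with neither bit set is returned unchanged together with `0`.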

fixupBwaFlags :: BamRec -> BamRec

Fixes typical inconsistencies produced by Bwa. Sometimes 'mate unmapped' should be set, which we can detect because the read's coordinates match the mate's coordinates. Sometimes 'properly paired' should not be set, because one mate is unmapped. This function is generally safe, but it only needs to be called on the output of affected (older?) versions of Bwa.
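The two rules can be sketched on a stripped-down record. This is one plausible reading of the description above, not the library's implementation: `Rec`, its fields and `fixPairFlags` are hypothetical, and the exact condition used to detect an unmapped mate is an assumption. The flag values are the standard SAM ones.

```haskell
import Data.Bits ((.&.), (.|.), complement)

-- Hypothetical minimal record; this is not the library's BamRec.
data Rec = Rec
  { flag      :: Int
  , refId     :: Int
  , pos       :: Int
  , mateRefId :: Int
  , matePos   :: Int
  } deriving (Eq, Show)

-- Standard SAM flag bits.
flagPaired, flagProperlyPaired, flagUnmapped, flagMateUnmapped :: Int
flagPaired         = 0x1
flagProperlyPaired = 0x2
flagUnmapped       = 0x4
flagMateUnmapped   = 0x8

fixPairFlags :: Rec -> Rec
fixPairFlags r = r { flag = clearProperPair withMateUnmapped }
  where
    isSet f    = flag r .&. f /= 0
    sameCoords = refId r == mateRefId r && pos r == matePos r

    -- ASSUMPTION: a paired, mapped read whose mate has identical
    -- coordinates is taken to have an unmapped mate.
    withMateUnmapped
      | isSet flagPaired && not (isSet flagUnmapped) && sameCoords
                  = flag r .|. flagMateUnmapped
      | otherwise = flag r

    -- 'properly paired' makes no sense if either mate is unmapped.
    clearProperPair f
      | f .&. (flagUnmapped .|. flagMateUnmapped) /= 0
                  = f .&. complement flagProperlyPaired
      | otherwise = f
```

A record with flags 0x3 whose mate coordinates equal its own comes out with flags 0x9. The same record with a mate at a different position is left alone.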

removeWarts :: BamRec -> BamRec

Removes syntactic warts from old read names or the read names used in FastQ files.
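The documentation does not list which warts are removed. Purely as an illustration of the kind of clean-up meant, the sketch below assumes two common FastQ artefacts: a leading '@' and a trailing "/1" or "/2" mate suffix. Both the choice of warts and the name `stripNameWarts` are assumptions, and the sketch works on a plain `String` rather than on a `BamRec`.

```haskell
-- ASSUMPTION: the warts handled here (leading '@', trailing "/1" or
-- "/2") are examples only; the library may remove a different set.
stripNameWarts :: String -> String
stripNameWarts = dropMateSuffix . dropAt
  where
    -- Drop the '@' that starts a FastQ header line.
    dropAt ('@' : rest) = rest
    dropAt name         = name

    -- Drop a trailing "/1" or "/2" mate marker.
    dropMateSuffix name = case reverse name of
      (d : '/' : rest) | d `elem` "12" -> reverse rest
      _                                -> name
```

Under these assumptions, `stripNameWarts "@read7/1"` gives `"read7"`, and a name without either artefact passes through unchanged.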