Description

This module provides a fairly direct representation of the SAM/BAM alignment format, along with an interface to read and write alignments in this format.

The package is based on the C SamTools library available at

http://samtools.sourceforge.net/

and the SAM/BAM file format is described here

http://samtools.sourceforge.net/SAM-1.3.pdf

This package does not compute alignments itself; it works with alignment files generated by other tools. The meaning of the various flags is therefore determined by the program that produced the alignment file.

# Target sequence sets

Information about one target sequence in a SAM alignment set

Constructors

HeaderSeq

Fields

name :: !ByteString

Target sequence name

len :: !Int64

Target sequence length

Target sequences from a SAM alignment set

Number of target sequences

Returns the list of target sequences

Returns a target sequence by ID, which is a 0-based index

Returns a target sequence name by ID

# SAM/BAM format alignments

data Bam1

SAM/BAM format alignment

Just the reference target sequence ID in the target set, or Nothing for an unmapped read

Just the target sequence name, or Nothing for an unmapped read

Just the total length of the target sequence, or Nothing for an unmapped read

Just the 0-based index of the leftmost aligned position on the target sequence, or Nothing for an unmapped read
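Because positions here are 0-based while the POS column of SAM text is 1-based, a small conversion is often needed when comparing against other tools. The sketch below assumes the SAM convention that POS 0 means "no coordinate"; `toSamPos` and `fromSamPos` are hypothetical helper names, not part of this module.

```haskell
import Data.Int (Int64)

-- | Convert a 0-based position (Nothing for an unmapped read) into the
-- 1-based POS value used in SAM text, where 0 means "no coordinate".
toSamPos :: Maybe Int64 -> Int64
toSamPos = maybe 0 (+ 1)

-- | Inverse direction: a SAM text POS value back to a 0-based position.
fromSamPos :: Int64 -> Maybe Int64
fromSamPos p
  | p > 0     = Just (p - 1)
  | otherwise = Nothing
```

For example, `toSamPos (Just 0)` gives `1` and `fromSamPos 0` gives `Nothing`.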

Is the pair properly aligned (usually based on relative orientation and distance)

Is the read paired and the mate unmapped

Is the fragment's reverse complement aligned to the target

Is the read paired and the mate's reverse complement aligned to the target

Is the fragment from the first read in the template

Is the fragment from the second read in the template

Is the fragment alignment secondary

Did the read fail quality controls

Is the read a technical duplicate
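The predicates above appear to mirror the bits of the SAM FLAG field. As a rough illustration, the sketch below decodes a raw FLAG value using the bit assignments from the SAM specification. It assumes you have the numeric flag at hand (for instance from SAM text); `flagNames` and `describeFlags` are hypothetical names and not functions of this module.

```haskell
import Data.Bits (testBit)
import Data.Word (Word16)

-- | Bit index and meaning for each SAM FLAG bit (0x1 through 0x400).
flagNames :: [(Int, String)]
flagNames =
  [ (0, "paired")
  , (1, "proper pair")
  , (2, "unmapped")
  , (3, "mate unmapped")
  , (4, "reverse")
  , (5, "mate reverse")
  , (6, "first in template")
  , (7, "second in template")
  , (8, "secondary")
  , (9, "QC fail")
  , (10, "duplicate")
  ]

-- | List the meanings of all bits set in a raw FLAG value.
describeFlags :: Word16 -> [String]
describeFlags f = [ meaning | (bit, meaning) <- flagNames, testBit f bit ]
```

For instance, `describeFlags 99` yields `["paired","proper pair","mate reverse","first in template"]`, since 99 = 0x1 + 0x2 + 0x20 + 0x40.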

cigars :: Bam1 -> [Cigar]

CIGAR description of the alignment

Name of the query sequence

Just the length of the query sequence, or Nothing when it is unavailable.

Just the query sequence, or Nothing when it is unavailable

Just the query qualities, or Nothing when it is unavailable. These are returned in ASCII format, i.e., q + 33.
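Since the qualities come back ASCII-encoded as q + 33, numeric Phred scores can be recovered by subtracting 33 from each byte. A minimal sketch, assuming the qualities are held in a strict `ByteString`; `phredScores` and `exampleQuals` are hypothetical names introduced here for illustration.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC

-- | Decode ASCII-encoded (q + 33) qualities into numeric Phred scores.
phredScores :: B.ByteString -> [Int]
phredScores = map (\w -> fromIntegral w - 33) . B.unpack

-- | A made-up quality string used only to demonstrate the decoding.
exampleQuals :: B.ByteString
exampleQuals = BC.pack "II5!"
```

Here `phredScores exampleQuals` evaluates to `[40,40,20,0]`, because `'I'` is ASCII 73, `'5'` is 53, and `'!'` is 33.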

Just the target ID of the mate alignment target reference sequence, or Nothing when the mate is unmapped or the read is unpaired.

Just the name of the mate alignment target reference sequence, or Nothing when the mate is unmapped or the read is unpaired.

Just the length of the mate alignment target reference sequence, or Nothing when the mate is unmapped or the read is unpaired.

Just the 0-based coordinate of the leftmost position in the mate alignment on the target, or Nothing when the read is unpaired or the mate is unmapped.

Just the total insert length, or Nothing when the length is unavailable, e.g. because the read is unpaired or the mated read pair does not align in the proper relative orientation on the same strand.

Just the number of mismatches in the alignment, or Nothing when this information is not present

Just the number of reported alignments, or Nothing when this information is not present.

Just the match descriptor alignment field, or Nothing when it is absent

Just the requested integer auxiliary field, or Nothing when it is absent

Just the requested single-precision float auxiliary field, or Nothing when it is absent

Just the requested double-precision float auxiliary field, or Nothing when it is absent

Just the requested character auxiliary field, or Nothing when it is absent

Just the requested string auxiliary field, or Nothing when it is absent

auxGet :: AuxGet a => Bam1 -> String -> Maybe a
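The signature of auxGet suggests that the type requested by the caller selects which kind of auxiliary field is returned. The toy model below shows how such return-type dispatch can work with a type class. It is only an illustration: `AuxValue`, `FromAux`, and `lookupAux` are hypothetical, and the real AuxGet class may be implemented quite differently.

```haskell
-- | Toy stand-in for a typed auxiliary field value.
data AuxValue
  = AuxInt Int
  | AuxFloat Float
  | AuxChar Char

-- | Types that can be extracted from an auxiliary value.
class FromAux a where
  fromAux :: AuxValue -> Maybe a

instance FromAux Int where
  fromAux (AuxInt i) = Just i
  fromAux _          = Nothing

instance FromAux Float where
  fromAux (AuxFloat x) = Just x
  fromAux _            = Nothing

instance FromAux Char where
  fromAux (AuxChar c) = Just c
  fromAux _           = Nothing

-- | Look up a two-letter tag; the result type chosen by the caller decides
-- which kind of value is accepted. A missing tag or a tag holding a
-- different kind of value both give Nothing.
lookupAux :: FromAux a => [(String, AuxValue)] -> String -> Maybe a
lookupAux fields tag = lookup tag fields >>= fromAux
```

With `fields = [("NM", AuxInt 2)]`, the expression `lookupAux fields "NM" :: Maybe Int` is `Just 2`, while the same lookup at type `Maybe Char` is `Nothing`.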

Just the reference sequence location covered by the alignment. This includes nucleotide positions that are reported to be deleted in the read, but not skipped nucleotide positions (typically intronic positions in a spliced alignment). Nothing if the reference location is unavailable, e.g. for an unmapped read or for a read with no CIGAR format alignment information.
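To make the treatment of deletions and skips concrete, the following sketch walks a CIGAR and collects the contiguous reference blocks it covers. It uses a toy `Op` type rather than this module's Cigar type, represents blocks as half-open 0-based `(start, end)` pairs, and returns an empty list where the function described above would return Nothing. All names in it are hypothetical.

```haskell
-- | Toy CIGAR operations; Match stands for M, = and X alike.
data Op = Match | Ins | Del | RefSkip | SoftClip | HardClip | Pad
  deriving (Eq, Show)

-- | Reference blocks covered by an alignment starting at a 0-based position.
-- Matches and deletions extend the current block, a reference skip closes
-- it and starts a new one, and all other operations consume no reference.
refBlocks :: Int -> [(Op, Int)] -> [(Int, Int)]
refBlocks start ops = finish (foldl step (start, start, []) ops)
  where
    step (blockStart, pos, acc) (op, n) = case op of
      Match   -> (blockStart, pos + n, acc)
      Del     -> (blockStart, pos + n, acc)
      RefSkip -> (pos + n, pos + n, close blockStart pos acc)
      _       -> (blockStart, pos, acc)
    -- Record a block only if it is non-empty.
    close s e acc
      | e > s     = (s, e) : acc
      | otherwise = acc
    finish (blockStart, pos, acc) = reverse (close blockStart pos acc)
```

For an alignment at position 100 with CIGAR 5S10M2D5M100N20M, `refBlocks` returns `[(100,117),(217,237)]`: the 2-base deletion stays inside the first block, while the 100-base skip separates the two blocks.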

Just the reference sequence location (as per refSpLoc) on the target reference (as per targetName)

data InHandle

Handle for reading SAM/BAM format alignments

Target sequence set for the alignments

Open a TAM (tab-delimited text) format file with @SQ headers for the target sequence set.

Open a TAM format file with a separate target sequence set index

Open a BAM (binary) format file

Close a SAM/BAM format alignment input handle

Target sequence set data is still available after the file input has been closed.

withTamInFile :: FilePath -> (InHandle -> IO a) -> IO a

Run an IO action using a handle to a TAM format file that will be opened (see openTamInFile) and closed for the action.
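The open, run, close pattern described here is commonly expressed with `bracket`, which also releases the resource if the action throws. The sketch below shows that pattern on an ordinary `System.IO` file handle; whether this library uses `bracket` internally is an assumption, and `withInFile` and `firstLine` are hypothetical names.

```haskell
import Control.Exception (bracket)
import System.IO (Handle, IOMode (ReadMode), hClose, hGetLine, openFile)

-- | Open a file, run the action on its handle, and always close the handle
-- afterwards, even when the action raises an exception.
withInFile :: FilePath -> (Handle -> IO a) -> IO a
withInFile path = bracket (openFile path ReadMode) hClose

-- | Example action: read only the first line of a file.
firstLine :: FilePath -> IO String
firstLine path = withInFile path hGetLine
```

One consequence of this design is that the handle must not escape the action: any value returned from it should be fully read before the wrapper closes the file.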

withTamInFileWithIndex :: FilePath -> FilePath -> (InHandle -> IO a) -> IO a

As withTamInFile, but with a separate target sequence set index (see openTamInFileWithIndex)

withBamInFile :: FilePath -> (InHandle -> IO a) -> IO a

As withTamInFile for BAM (binary) format files

Read one alignment from an input handle, or return Nothing at end-of-file
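A reader that signals end-of-file with Nothing is typically consumed in a loop that stops at the first Nothing. The sketch below shows such a loop for any action of type `IO (Maybe a)`. The `mkReader` helper is a hypothetical in-memory stand-in for a real input handle, used only so the example runs without an alignment file.

```haskell
import Data.IORef (atomicModifyIORef', newIORef)

-- | Call the reader repeatedly and collect results until it returns Nothing.
drain :: IO (Maybe a) -> IO [a]
drain next = go []
  where
    go acc = next >>= maybe (return (reverse acc)) (\x -> go (x : acc))

-- | Stand-in reader that hands out the elements of a list one at a time
-- and then keeps returning Nothing.
mkReader :: [a] -> IO (IO (Maybe a))
mkReader xs = do
  ref <- newIORef xs
  return $ atomicModifyIORef' ref $ \ys -> case ys of
    []       -> ([], Nothing)
    (z : zs) -> (zs, Just z)
```

Note that `drain` reads everything into memory before returning, so for large alignment files a per-record loop that processes each alignment as it arrives is likely preferable.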

Read a BAM file as a lazy stream of Bam1 records.

# Writing SAM/BAM format files

data OutHandle

Handle for writing SAM/BAM format alignments

Target sequence set for the alignments

Open a TAM format file with @SQ headers for writing alignments

Open a BAM format file for writing alignments

Close an alignment output handle

put1 :: OutHandle -> Bam1 -> IO ()

Write one alignment to an output handle.

There is no validation that the target sequence set of the output handle matches the target sequence set of the alignment.
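Since no such validation is performed, callers who mix alignments from several sources may want to compare target sequence sets themselves before writing. A minimal sketch of that check follows, using a toy `Target` record in place of HeaderSeq and plain lists in place of the header type; `firstMismatch` is a hypothetical helper, not part of this module.

```haskell
-- | Toy stand-in for one target sequence: a name and a length.
data Target = Target { tName :: String, tLen :: Int }
  deriving (Eq, Show)

-- | Nothing when both target sets agree in order, names and lengths;
-- otherwise a description of the first difference found.
firstMismatch :: [Target] -> [Target] -> Maybe String
firstMismatch = go (0 :: Int)
  where
    go _ [] [] = Nothing
    go i (x : xs) (y : ys)
      | x == y    = go (i + 1) xs ys
      | otherwise =
          Just ("target " ++ show i ++ " differs: " ++ show x ++ " vs " ++ show y)
    go i _ _ = Just ("target sets differ in length at index " ++ show i)
```

Order matters in this comparison because alignments refer to targets by their 0-based ID, so two sets with the same sequences in a different order are not interchangeable.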