biohazard-1.1.0: bioinformatics support library

Safe Haskell: None
Language: Haskell2010

Bio.Iteratee.Bgzf

Description

Handling of BGZF files. Currently there is one Enumeratee each for input and output. The input Enumeratee can optionally supply virtual file offsets, which makes seeking possible.

Documentation

data Block

One BGZF block: virtual offset and contents. Could also be a block of an uncompressed file, if we want to support indexing of uncompressed BAM or some silliness like that.

Constructors

Block 
Instances (all defined in Bio.Iteratee.Bgzf)

Semigroup Block

Monoid Block

Nullable Block

    nullC :: Block -> Bool

NullPoint Block

    emptyP :: Block

decompressBgzfBlocks' :: MonadIO m => Int -> Enumeratee Bytes Block m a

Decompresses a BGZF stream into a stream of Blocks, np-fold in parallel (np presumably being the Int argument).

decompressBgzf :: MonadIO m => Enumeratee Bytes Bytes m a

Decompresses a BGZF stream into a stream of Bytes.

decompressPlain :: MonadIO m => Enumeratee Bytes Block m a

"Decompresses" a plain file. No decompression actually takes place: the offset in the input stream is tracked and attached to each Bytes chunk, yielding Blocks. This gives the same interface as decompressing actual BGZF.
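The offset tracking described for decompressPlain can be sketched on plain lists. This is only an illustration: OffsetChunk and tagWithOffsets are hypothetical names, not part of Bio.Iteratee.Bgzf, and whether the real function stores the offset raw or in shifted virtual-offset form is not stated here.

```haskell
import qualified Data.ByteString as B
import Data.Int (Int64)

-- Hypothetical stand-in for Block: a chunk plus the file offset it starts at.
data OffsetChunk = OffsetChunk
  { chunkOffset :: Int64
  , chunkBytes  :: B.ByteString
  } deriving (Eq, Show)

-- Pair every chunk with the number of bytes that preceded it in the stream.
tagWithOffsets :: [B.ByteString] -> [OffsetChunk]
tagWithOffsets chunks = zipWith OffsetChunk offsets chunks
  where
    -- Running sum of chunk lengths, starting at offset 0.
    offsets = scanl (+) 0 (map (fromIntegral . B.length) chunks)
```

For chunks of 3, 2 and 1 bytes, the attached offsets are 0, 3 and 5.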

maxBlockSize :: Int

Maximum block size for BGZF: 64k, leaving some room for headers and incompressible data.
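A size limit like this implies that input has to be cut into pieces before each piece is compressed into its own block. The following is a minimal sketch of such a splitter, assuming the limit is passed in explicitly. splitBlocks is a hypothetical helper, not a function of this module, and the actual value of maxBlockSize is not assumed.

```haskell
import qualified Data.ByteString as B

-- Cut a byte string into consecutive pieces of at most 'limit' bytes.
-- Only the last piece may be shorter than the limit.
splitBlocks :: Int -> B.ByteString -> [B.ByteString]
splitBlocks limit bs
  | limit <= 0 = error "splitBlocks: limit must be positive"
  | B.null bs  = []
  | otherwise  = let (piece, rest) = B.splitAt limit bs
                 in piece : splitBlocks limit rest
```

With a limit of 4, ten bytes of input yield pieces of 4, 4 and 2 bytes, and concatenating the pieces gives back the input.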

bgzfEofMarker :: Bytes

The EOF marker for BGZF files. This is just an empty string compressed as BGZF. Appended to BAM files to indicate their end.
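As an illustration, the 28-byte EOF block given in the SAM/BAM specification can be written out and used to test whether a byte string ends with it. The names eofMarkerBytes and hasBgzfEof are hypothetical, and it is an assumption that bgzfEofMarker holds exactly these bytes.

```haskell
import qualified Data.ByteString as B

-- The empty BGZF block from the SAM/BAM specification (28 bytes).
eofMarkerBytes :: B.ByteString
eofMarkerBytes = B.pack
  [ 0x1f, 0x8b, 0x08, 0x04   -- gzip magic, CM = deflate, FLG = FEXTRA
  , 0x00, 0x00, 0x00, 0x00   -- MTIME
  , 0x00, 0xff               -- XFL, OS = unknown
  , 0x06, 0x00               -- XLEN = 6
  , 0x42, 0x43, 0x02, 0x00   -- subfield 'B' 'C', SLEN = 2
  , 0x1b, 0x00               -- BSIZE = 27 (total block size minus one)
  , 0x03, 0x00               -- empty deflate stream
  , 0x00, 0x00, 0x00, 0x00   -- CRC32 of the empty string
  , 0x00, 0x00, 0x00, 0x00   -- ISIZE = 0
  ]

-- True if the data ends with the EOF block, i.e. the file was not truncated
-- at a block boundary before the writer finished.
hasBgzfEof :: B.ByteString -> Bool
hasBgzfEof = B.isSuffixOf eofMarkerBytes
```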

liftBlock :: Monad m => Iteratee Bytes m a -> Iteratee Block m a

Runs an Iteratee for Bytes when decompressing BGZF. Adds internal bookkeeping.

getOffset :: Iteratee Block m FileOffset

Get the current virtual offset. The virtual address in a BGZF stream contains the offset of the current block in the upper 48 bits and the current offset into that block in the lower 16 bits. This scheme is compatible with the way BAM files are indexed.
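The 48/16-bit layout can be made concrete with a little bit arithmetic. This is a sketch only: packVirtualOffset and unpackVirtualOffset are hypothetical helpers, and Word64 is used here as an assumption in place of the library's FileOffset type.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Word (Word64)

-- Combine the file offset of a compressed block (upper 48 bits) with an
-- offset into its decompressed contents (lower 16 bits).
packVirtualOffset :: Word64 -> Word64 -> Word64
packVirtualOffset blockOffset inBlock =
  (blockOffset `shiftL` 16) .|. (inBlock .&. 0xFFFF)

-- Split a virtual offset back into (block offset, offset within block).
unpackVirtualOffset :: Word64 -> (Word64, Word64)
unpackVirtualOffset v = (v `shiftR` 16, v .&. 0xFFFF)
```

For example, `unpackVirtualOffset (packVirtualOffset 1000 42)` gives back `(1000, 42)`. Because the block offset occupies the high bits, comparing two virtual offsets numerically orders them by position in the decompressed stream.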

isBgzf :: Monad m => Iteratee Bytes m Bool

Tests whether a stream is in BGZF format. Does not consume any input.

isGzip :: Monad m => Iteratee Bytes m Bool

Tests whether a stream is in GZip format. Also returns True on a BGZF stream, which is technically a special case of GZip.
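The relationship between the two formats shows up in the first few header bytes. Below is a minimal sketch of both checks on a strict byte string, assuming the BGZF subfield is the first entry of the gzip extra field (the common layout). looksLikeGzip and looksLikeBgzf are hypothetical names, and this is not necessarily how isGzip and isBgzf are implemented.

```haskell
import qualified Data.ByteString as B
import Data.Bits ((.&.))

-- Any gzip member starts with the magic bytes 0x1f 0x8b.
looksLikeGzip :: B.ByteString -> Bool
looksLikeGzip bs = B.pack [0x1f, 0x8b] `B.isPrefixOf` bs

-- BGZF is gzip with deflate compression, the FEXTRA flag set, and an
-- extra subfield tagged 'B' 'C' with a 2-byte payload (the block size).
looksLikeBgzf :: B.ByteString -> Bool
looksLikeBgzf bs =
     looksLikeGzip bs
  && B.length bs >= 16            -- guard before indexing
  && B.index bs 2 == 8            -- CM = deflate
  && B.index bs 3 .&. 4 /= 0      -- FLG.FEXTRA
  && B.index bs 12 == 66          -- SI1 = 'B'
  && B.index bs 13 == 67          -- SI2 = 'C'
  && B.index bs 14 == 2           -- SLEN low byte
  && B.index bs 15 == 0           -- SLEN high byte
```

Every input accepted by looksLikeBgzf is also accepted by looksLikeGzip, mirroring the behaviour described for isGzip.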

parMapChunksIO :: (MonadIO m, Nullable s) => Int -> (s -> IO t) -> Enumeratee s t m a

Parallel map of an IO action over the elements of a stream.

This Enumeratee applies an IO action to every chunk of the input stream. These IO actions are run asynchronously in a limited parallel way. Don't forget to evaluate
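The idea of running actions asynchronously with limited parallelism can be sketched on plain lists using threads and MVars. This is an illustration under stated assumptions, not the implementation of parMapChunksIO: parMapBounded is a hypothetical function, results are returned in input order, and forcing each result to weak head normal form in the worker is a choice of this sketch.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)

-- Apply an IO action to every element, keeping at most n actions in flight.
-- If an action throws, its result slot is never filled and the caller
-- blocks; a real implementation would have to propagate the exception.
parMapBounded :: Int -> (a -> IO b) -> [a] -> IO [b]
parMapBounded n f = go []
  where
    limit = max 1 n

    -- 'pending' holds the result slots of running actions, oldest first.
    go pending (x : xs)
      | length pending < limit = do
          slot <- newEmptyMVar
          -- Force the result in the worker so the work happens there.
          _ <- forkIO (f x >>= \y -> y `seq` putMVar slot y)
          go (pending ++ [slot]) xs
    -- Window full or input exhausted: wait for the oldest result.
    go (slot : pending) xs = do
      y  <- takeMVar slot
      ys <- go pending xs
      return (y : ys)
    go [] [] = return []
```

Waiting on the oldest slot first keeps the output order equal to the input order, at the cost of briefly idling when a later action finishes before an earlier one.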

compressBgzf :: MonadIO m => Enumeratee BgzfChunk Bytes m a

Like compressBgzf', with sensible defaults.

compressBgzf' :: MonadIO m => CompressParams -> Enumeratee BgzfChunk Bytes m a

Compresses a stream of Bytes into a stream of BGZF blocks, in parallel.