biohazard-0.6.15: bioinformatics support library

Safe Haskell: None
Language: Haskell2010

Bio.Iteratee.Bgzf

Description

Handling of BGZF files. Right now, we have an Enumeratee each for

Documentation

data Block

One BGZF block: virtual offset and contents. Could also be a block of an uncompressed file, if we want to support indexing of uncompressed BAM or some silliness like that.

Constructors

Block 

decompressBgzfBlocks' :: MonadIO m => Int -> Enumeratee Bytes Block m a

Decompress a BGZF stream into a stream of Blocks, np-fold parallel, where np is the Int argument.

decompressBgzf :: MonadIO m => Enumeratee Bytes Bytes m a

Decompress a BGZF stream into a stream of Bytes.

decompressPlain :: MonadIO m => Enumeratee Bytes Block m a

"Decompresses" a plain file. No decompression actually takes place: the offset in the input stream is tracked and attached to each chunk of Bytes, giving Blocks. This results in the same interface as decompressing actual BGZF.
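The offset bookkeeping described for decompressPlain can be sketched on plain lists. This is only an illustration: PlainBlock and tagWithOffsets are hypothetical names, not part of this module, and the sketch assumes the recorded offset is the plain byte position of each chunk in the input.

```haskell
import qualified Data.ByteString as B
import Data.Int (Int64)

-- | Hypothetical stand-in for the library's Block type:
-- the offset at which a chunk starts, plus the chunk itself.
data PlainBlock = PlainBlock
    { pbOffset   :: !Int64
    , pbContents :: !B.ByteString
    } deriving (Eq, Show)

-- | Pair every chunk with the number of bytes that precede it.
tagWithOffsets :: [B.ByteString] -> [PlainBlock]
tagWithOffsets chunks = zipWith PlainBlock offsets chunks
  where
    -- Running total of chunk lengths, starting at 0.
    offsets = scanl (+) 0 (map (fromIntegral . B.length) chunks)
```

The real Enumeratee works incrementally on a stream rather than on a list, but the running sum is the same idea.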

maxBlockSize :: Int

Maximum block size for BGZF: 64k with some room for headers and incompressible data.
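To illustrate why such a limit matters, here is a minimal sketch that cuts an input into pieces small enough to fit one block each. The names payloadLimit and splitPayload are hypothetical, and the value 0xff00 is an assumption chosen to leave headroom below 65536; it is not claimed to be the value of maxBlockSize.

```haskell
import qualified Data.ByteString as B

-- | Assumed payload limit per block; NOT the library's maxBlockSize.
payloadLimit :: Int
payloadLimit = 0xff00

-- | Split the input into consecutive pieces of at most payloadLimit bytes.
splitPayload :: B.ByteString -> [B.ByteString]
splitPayload s
    | B.null s  = []
    | otherwise =
        let (h, t) = B.splitAt payloadLimit s
        in  h : splitPayload t
```

Each piece could then be compressed independently, which is what makes parallel compression of BGZF possible.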

bgzfEofMarker :: Bytes

The EOF marker for BGZF files. This is just an empty string compressed as BGZF. Appended to BAM files to indicate their end.
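For reference, the SAM/BAM specification gives the empty BGZF block as a fixed 28-byte sequence. The sketch below writes those bytes out and checks whether a buffer ends with them. The names eofBlock and endsWithEof are hypothetical, and the sketch does not claim that bgzfEofMarker is defined this way internally.

```haskell
import qualified Data.ByteString as B

-- | The 28-byte empty BGZF block as listed in the SAM/BAM specification.
eofBlock :: B.ByteString
eofBlock = B.pack
    [ 0x1f, 0x8b, 0x08, 0x04, 0x00, 0x00, 0x00, 0x00
    , 0x00, 0xff, 0x06, 0x00, 0x42, 0x43, 0x02, 0x00
    , 0x1b, 0x00, 0x03, 0x00, 0x00, 0x00, 0x00, 0x00
    , 0x00, 0x00, 0x00, 0x00 ]

-- | True if the given bytes end with the empty BGZF block.
endsWithEof :: B.ByteString -> Bool
endsWithEof = B.isSuffixOf eofBlock
```

A file that lacks this trailer may have been truncated, which is why readers commonly check for it.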

liftBlock :: Monad m => Iteratee Bytes m a -> Iteratee Block m a

Runs an Iteratee for Bytes when decompressing BGZF. Adds internal bookkeeping.

getOffset :: Iteratee Block m FileOffset

Get the current virtual offset. The virtual address in a BGZF stream contains the offset of the current block in the upper 48 bits and the current offset into that block in the lower 16 bits. This scheme is compatible with the way BAM files are indexed.
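The 48/16-bit layout described above can be made concrete with a little bit arithmetic. This is a sketch only: packVirtualOffset and unpackVirtualOffset are hypothetical helpers, and Word64 is used here as an assumption in place of the library's FileOffset.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Word (Word64)

-- | Combine the file offset of a compressed block (upper 48 bits)
-- with an offset into its decompressed contents (lower 16 bits).
packVirtualOffset :: Word64 -> Word64 -> Word64
packVirtualOffset blockStart within =
    (blockStart `shiftL` 16) .|. (within .&. 0xFFFF)

-- | Split a virtual offset into (block offset, offset within block).
unpackVirtualOffset :: Word64 -> (Word64, Word64)
unpackVirtualOffset v = (v `shiftR` 16, v .&. 0xFFFF)
```

Because the block offset occupies the high bits, virtual offsets compare in the same order as positions in the decompressed data, which is what makes them usable as index keys.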

isBgzf :: Monad m => Iteratee Bytes m Bool

Tests whether a stream is in BGZF format. Does not consume any input.

isGzip :: Monad m => Iteratee Bytes m Bool

Tests whether a stream is in GZip format. Also returns True on a Bgzf stream, which is technically a special case of GZip.
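The relationship between the two tests can be illustrated on a strict ByteString holding the first bytes of a file. This is a simplified sketch, not the library's implementation: looksLikeGzip and looksLikeBgzf are hypothetical names, and the BGZF check assumes the 'BC' subfield is the first extra subfield, which the format does not strictly require.

```haskell
import qualified Data.ByteString as B
import Data.Bits (testBit)

-- | gzip members start with the magic bytes 0x1f 0x8b.
looksLikeGzip :: B.ByteString -> Bool
looksLikeGzip s = B.take 2 s == B.pack [0x1f, 0x8b]

-- | BGZF is gzip with deflate (CM = 8), the FEXTRA flag set (bit 2 of FLG),
-- and an extra subfield with identifier 'B','C' and a length of 2.
looksLikeBgzf :: B.ByteString -> Bool
looksLikeBgzf s =
    B.length s >= 18                 -- 12 header bytes + 6 subfield bytes
    && looksLikeGzip s
    && B.index s 2 == 8              -- CM: deflate
    && testBit (B.index s 3) 2       -- FLG.FEXTRA
    && B.index s 12 == 0x42          -- SI1 = 'B'
    && B.index s 13 == 0x43          -- SI2 = 'C'
    && B.index s 14 == 2             -- SLEN, low byte
    && B.index s 15 == 0             -- SLEN, high byte
```

Every input accepted by looksLikeBgzf is also accepted by looksLikeGzip, mirroring the remark that BGZF is a special case of GZip.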

parMapChunksIO :: (MonadIO m, Nullable s) => Int -> (s -> IO t) -> Enumeratee s t m a

Parallel map of an IO action over the elements of a stream.

This Enumeratee applies an IO action to every chunk of the input stream. These IO actions are run asynchronously in a limited parallel way. Don't forget to evaluate
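The idea of running actions "asynchronously in a limited parallel way" can be sketched on plain lists with forkIO and MVar. This is not how parMapChunksIO is implemented; spawn and boundedParMapIO are hypothetical names, and the sketch assumes that results must come back in input order with at most a fixed number of actions in flight.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)
import Control.Exception (SomeException, evaluate, throwIO, try)

-- | Run an action in a new thread. The MVar is filled with the result
-- (forced to weak head normal form in the worker) or with the exception.
spawn :: IO b -> IO (MVar (Either SomeException b))
spawn act = do
    var <- newEmptyMVar
    _ <- forkIO (try (act >>= evaluate) >>= putMVar var)
    return var

-- | Map an IO action over a list, keeping at most 'depth' actions
-- in flight and returning the results in input order.
boundedParMapIO :: Int -> (a -> IO b) -> [a] -> IO [b]
boundedParMapIO depth f xs = do
    let (now, later) = splitAt (max 1 depth) xs
    running <- mapM (spawn . f) now
    go running later
  where
    go [] _ = return []
    go (v:vs) rest = do
        -- Wait for the oldest action; rethrow if it failed.
        r <- takeMVar v >>= either throwIO return
        case rest of
            []     -> (r :) <$> go vs []
            (y:ys) -> do
                -- A slot is free again, so start the next action.
                v' <- spawn (f y)
                (r :) <$> go (vs ++ [v']) ys
```

Forcing the result inside the worker thread matters: without it, a lazy result would be handed back as an unevaluated thunk and the real work would happen sequentially in the consumer.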

compressBgzf :: MonadIO m => Enumeratee BgzfChunk Bytes m a

Like compressBgzf', with sensible defaults.

compressBgzf' :: MonadIO m => CompressParams -> Enumeratee BgzfChunk Bytes m a

Compresses a stream of Bytes into a stream of BGZF blocks, in parallel.

data CompressParams

Constructors

CompressParams 

Fields

compression_level :: Int
 
queue_depth :: Int
 


compressChunk :: Int -> Ptr Word8 -> CUInt -> IO Bytes