Read (and write?) the SFF file format used by
Roche/454 sequencing to store flowgram data.
A flowgram is a series of values (intensities) representing homopolymer runs of
A, G, C, and T in a fixed cycle, and is usually displayed as a histogram.
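As a rough illustration of how a flowgram maps to bases, the sketch below turns a list of flow values into a sequence. It is not this module's API: the name flowToBases is hypothetical, the flow order is passed in as a parameter, and the code assumes that values are stored as hundredths (so 102 means an intensity of 1.02) and are rounded to the nearest whole run length.

```haskell
import Data.Word (Word16)

-- | Hypothetical helper: convert raw flow values to a base string.
-- Assumes each value is an intensity in hundredths and that the flow
-- order (which must be non-empty) repeats cyclically.
flowToBases :: String -> [Word16] -> String
flowToBases order flows = concat (zipWith call (cycle order) flows)
  where
    -- Emit the base once per estimated homopolymer unit.
    call base v = replicate (runLength v) base
    -- Round to the nearest whole run length (e.g. 198 becomes 2).
    runLength :: Word16 -> Int
    runLength v = (fromIntegral v + 50) `div` 100
```

With an illustrative flow order of "TACG", the values [102, 4, 198, 97] would give one T, no A, two C and one G.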
The Staden Package contains an io_lib, with a C routine for parsing this format.
According to comments in the sources, the io_lib implementation is based on a file
called getsff.c, which I've been unable to track down.
It is believed that all values are stored big-endian.
|The data structure storing the contents of an SFF file (modulo the index)
SFF has a 31-byte common header
Todo: remove items that are derivable (counters, magic, etc)
cheader_lenght points to the first read header.
Also, the format allows the index to appear anywhere between reads, so
we should really keep count and check for it at each read. In practice, it
seems to be placed after the reads.
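The offset of the first read header can be sketched as follows. This is a sketch under assumptions rather than this module's code: the names padTo8 and commonHeaderLength are hypothetical, and it assumes that the variable part of the common header holds one byte per flow plus the key sequence, and that the whole header is padded to a multiple of eight bytes.

```haskell
-- | Round a byte count up to the next multiple of eight.
padTo8 :: Int -> Int
padTo8 n = ((n + 7) `div` 8) * 8

-- | Hypothetical helper: total length of the common header, and hence
-- the offset of the first read header, given the number of flows per
-- read and the key length. The 31 is the fixed part of the header.
commonHeaderLength :: Int -> Int -> Int
commonHeaderLength numFlows keyLen = padTo8 (31 + numFlows + keyLen)
```

For example, 400 flows and a 4-byte key give 435 bytes of content, which pads to 440.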
The following two fields are considered part of the header, but as
they are static, they are not part of the data structure
magic :: Word32 -- ^ 0x2e736666, i.e. the string .sff
version :: Word32 -- ^ 0x00000001
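A minimal sketch of checking these two static fields, assuming big-endian storage as noted above. The names be32, sffMagic and validPrefix are hypothetical and are not part of this module.

```haskell
import Data.Bits (shiftL, (.|.))
import Data.Char (ord)
import Data.Word (Word8, Word32)

-- | Combine bytes, most significant first, into a Word32.
be32 :: [Word8] -> Word32
be32 = foldl (\acc b -> (acc `shiftL` 8) .|. fromIntegral b) 0

-- | The magic number, built from the ASCII string ".sff".
sffMagic :: Word32
sffMagic = be32 (map (fromIntegral . ord) ".sff")

-- | Hypothetical check: do the first eight bytes hold the expected
-- magic number followed by version 1?
validPrefix :: [Word8] -> Bool
validPrefix bytes =
  length bytes >= 8
    && be32 (take 4 bytes) == sffMagic
    && be32 (take 4 (drop 4 bytes)) == 1
```

Reading the bytes in the opposite order would not match the magic number, which makes this a cheap sanity check on the assumed byte order.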
|Each Read has a fixed read header
|This contains the actual flowgram for a single read.
|Test serialization by outputting the header and first two reads
in an SFF, and the same after a decode + encode cycle.
|Convert a file by decoding it and re-encoding it.
This will lose the index (which isn't really necessary).
|The type of flowgram value
|Basic type for quality data. Range 0..255. Typical Phred output is in
the range 6..50, with 20 as the line in the sand separating good from bad.
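To make the threshold of 20 concrete, the sketch below converts a quality value to an error probability using the standard Phred definition Q = -10 log10(p). The function names are hypothetical and the code works on a plain Word8 rather than on this module's own quality type.

```haskell
import Data.Word (Word8)

-- | Error probability implied by a Phred quality value.
-- A quality of 20 corresponds to a 1 in 100 chance of a wrong call.
errorProb :: Word8 -> Double
errorProb q = 10 ** (negate (fromIntegral q) / 10)

-- | Hypothetical filter using 20 as the cut-off between good and bad.
isGood :: Word8 -> Bool
isGood q = q >= 20
```

Each step of 10 in quality lowers the error probability by a factor of ten, so the typical upper value of 50 corresponds to one error in 100,000 calls.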