úÎT©MíV      !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTU Safe-Inferred;Read names encode various information, as per this struct.  VW XYZ[\]^_ VW XYZ[\]^_  VW XYZ[\]^_None$`]This allows us to decode the constant parts of the read header for verifying its correcness. aRRSFF wraps an SFF to provide an instance of Binary with some more error checking. 5This contains the actual flowgram for a single read. CEach Read has a fixed read header, containing various information.  SFF has a 31-byte common header ?The format is open to having the index anywhere between reads, I we should really keep count and check for each read. In practice, it ' seems to be places after the reads. CThe following two fields are considered part of the header, but as < they are static, they are not part of the data structure  8 magic :: Word32 -- 0x2e736666, i.e. the string ".sff" ' version :: Word32 -- 0x00000001 Points to a text(?) section &JThe data structure storing the contents of an SFF file (modulo the index) )The type of flowgram value b1An SFF file always start with this magic number. cVersion is always 1. *Read an SFF file. +Trim a  . limiting the number of flows. If writing to ) an SFF file, make sure you update the  accordingly.  See examples/Flx.hs for how to use this. ,=Trim a read to specific sequence position, inclusive bounds. -.Trim a read according to clipping information .Trim adapters from a read /%Trim the key (i.e. first four bases) 0?Convert a flow position to the corresponding sequence position 1?Convert a sequence position to the corresponding flow position 23Read an SFF file, but be resilient against errors. 3 Write an & to the specified file name 4 Write an &- to the specified file name, but go back and ? update the read count. Useful if you want to output a lazy  stream of  )s. Returns the number of reads written. dWrite  s to a file handle. 5test serialization by output'$ing the header and first two reads : in an SFF, and the same after a decode + encode cycle. 61Convert a file by decoding it and re-encoding it & This will lose the index (which isn't really necessary) e!Generalized function for padding f%Generalized function to skip padding 7Helper function for decoding a  . 8A ReadBlock can'9t be an instance of Binary directly, since it depends on & information from the CommonHeader. 96Unpack the flow_data field into a list of flow values :SPack a list of flows into the corresponding binary structure (the flow_data field) ;'Helper function to access the flowgram <5Extract the sequence with masked bases in lower case =9Extract the index as absolute coordinates, not relative. gEnsure that the header we'&re decoding matches our expectations. h Decode a &, verifying that the data make sense. i>Wrapper for ReadBlocks since they need additional information V`jklmnopqrast  !"#$%&'uv()bc*wx+,-./01234d56ef789:;<=ghyz{|}~€i‚ƒ>  !"#$%&'()*+,-./0123456789:;<=>&' !"#$% *342-,/.10+56;<=:9)( 872` jklmnopqrast    !"#$%&'uv()bc*wx+,-./01234d56ef789:;<=ghyz{|}~€i‚ƒNone>?TrimFilters modify the read, typically trimming it for quality ?GDiscardFilters determine whether a read is to be retained or discarded @&This filter discards empty sequences. ADiscard sequences that don'7t have the given key tag (typically TCAG) at the start  of the read. B  2.2.1.2 The dots< filter discards sequences where the last positive flow is Q before flow 84, and flows with >5% dots (i.e. three successive noise values) P before the last postitive flow. The percentage can be given as a parameter. C  2.2.1.3 The mixed@ filter discards sequences with more than 70% positive flows.  Also, discard with  30%noise,20% middle (0.45..0.75) or <30% positive. DTDiscard a read if the number of untrimmed flows is less than n (n=186 for Titanium) E 02.2.1.4 Signal intensity trim - trim back until < 3% borderline flows (0.5..0.7). D Then trim borderline values or dots from the end (use a window). G 2.2.1.5 Primer filter Q This looks for the B-adaptor at the end of the read. The 454 implementation isn't very ( effective at finding mutated adaptors. I Z2.2.1.7 Quality score trimming trims using a 10-base window until a Q20 average is found. KDList length as a double (eliminates many instances of fromIntegral) LCalculate average of a list MZTranslate a number of flows to position in sequence, and update clipping data accordingly N:Update clip_qual_right if more severe than previous value >?@ABCDEFGHIJKLMNOPQRSTU>?@ABCDEFGHIJKLMNOPQRSTU?@ABCD>EFGHIJKLMNOUTSRQP>?@ABCDEFGHIJKLMNOPQRSTU„       !"#$%&'(()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijaklmnopqrbsttuvwxyz{|}~€‚biosff-0.3.7.1Bio.Sequence.SFFBio.Sequence.SFF_filtersBio.Sequence.SFF_name biocore-0.3.1Bio.Core.SequenceSeqDataQualQualDataReadNamedatetimeregionx_locy_locdecodeReadNameencodeReadName ReadBlock read_header flow_data flow_indexbasesquality ReadHeader name_length num_basesclip_qual_leftclip_qual_rightclip_adapter_leftclip_adapter_right read_name CommonHeader index_offset index_length num_reads key_length flow_length flowgram_fmtflowkeySFFIndexFlowreadSFF trimFlows trimFromTotrim trimAdaptertrimKey flowToBasePos baseToFlowPos recoverSFFwriteSFF writeSFF'testconvertgetRBputRB unpackFlows packFlowsflowgram masked_basescumulative_index TrimFilter DiscardFilter discard_empty discard_key discard_dots discard_mixeddiscard_length trim_sigintsigint trim_primer find_primer trim_qual20qual20dlengthavg clipFlowsclipSeq flx_linker ti_linker rna_adapter rna_adapter2 rna_adapter3 rapid_adapter ti_adapter_bdecodeLocation decodeDateencodeLocation encodeRegion encodeDatedivModsdecode36decChencode36b36PartialReadHeaderRSFFmagicversions writeReadspadskip getSaneHeader decodeSaneH $fBinaryRBI_pread_header_lenght _pname_length _pnum_bases_pclip_qual_left_pclip_qual_right_clip_adapter_left_pclip_adapter_right _pread_name unRecoveredRBI getBlocksgetBlock$fBinaryPartialReadHeader $fBinaryRSFF$fShowReadBlock$fBinaryReadHeader$fShowReadHeader$fBinaryCommonHeader$fShowCommonHeader $fBinarySFF $fShowSFF$fBioSeqQualReadBlock$fBioSeqReadBlock