BiobaseInfernal-0.7.0.1

Modules: Biobase.SElab.Types, Biobase.SElab.HMM, Biobase.SElab.HMM.Import, Biobase.SElab.RfamNames, Biobase.SElab.RfamNames.Import, Biobase.SElab.Taxonomy, Biobase.SElab.Taxonomy.Import, Biobase.SElab.CM, Biobase.SElab.CM.Import


Biobase.SElab.Types

  Classification
      Classification names (taxonomic classification).

  BitScore
      Infernal bit score. Behaves like a double (deriving Num).
      Infernal user's guide, p.42: log-odds score in log_2 (aka bits).
      S = log_2 (P(seq|CM) / P(seq|null))

  Species
      Species have an accession number, too.

  Rfam
      Tag as being an Rfam model. Used for Stockholm and CM files.

  Pfam
      Tag as being a Pfam model.

  Clan
      Tag as being a clan.

  Identification
      One-word name for the family or clan. Phantom-typed with the correct
      type of model. Can be a longer name for species.

  Accession
      Accession number, in the format RFxxxxx, PFxxxxx, or CLxxxxx. We keep
      only the Int part. A phantom type specifies which kind of accession
      number this is. For Species, we just have an index, it seems.

  prob2Score
      Given a null model and a probability, calculate the corresponding
      BitScore.

  score2Prob
      Given a null model and a BitScore, return the corresponding
      probability.


Biobase.SElab.HMM

  Node
      The nodes in an HMM. Starting with Node 0 for BEGIN.

  NegLogProb
      Negated natural logarithm of probability.
      TODO put into types stuff

  HMM
      The HMM3 data structure in "slow mode".
      TODO shouldn't this be Identification Pfam?
      TODO maybe redo the whole idd idea and just keep the string?


Biobase.SElab.HMM.Import

  parseHMM3
      TODO not everything is currently being parsed. Notably the rf, cs,
      and alignmap annotations.

  legalHMM
      Check if we have a legal HMMER3 model.

  readBoolean
      Read boolean flags.

  readAlph
      Determine which alphabet is in use by the HMM.

  readBS
      Read from a bytestring into a structure.

  headerMap
      Create an associative map of the key/value data.

  parseBegin
      Parse the two beginning lines.

  parseNodes
      Parse all individual nodes, except the first one, which uses
      parseBegin.

  readNLP
      Read a HMMER negated log-probability.

  compoLine
      Read the optional COMPO line.

  sathLines
      Read the alphabet and transition lines.

  headerLines
      All the header lines until we see HMM.

  test
      Simple test for the HMMer parser.


Biobase.SElab.Taxonomy

  Taxonomy
      For each species, we store the name and a classification list from
      most general (head) to most specific (last). The database comes with
      the NCBI taxon identifier (taxid).

  shortenName
      Given a name such as "Drosophila Melanogaster", returns
      "d.melanogaster".


Biobase.SElab.CM

  Emits
      Certain states (IL, IR, ML, MR) emit a single nucleotide, one state
      (MP) emits a pair, and the other states emit nothing.

  StateID
      State IDs.

  StateType
      Encode CM state types.

  NodeID
      Node IDs.

  NodeType
      Encode CM node types.

  CMVersion
      Encode the CM versions we can parse.

  State
      A single state. Field documentation, in record order:
        - The ID of this state.
        - To which node does this state belong.
        - Node type for this state.
        - Type of the state.
        - Which transitions, id and bitscore.
        - Do we emit characters.

  CM
      This is an Infernal covariance model. We have a number of blocks:
        - basic information like the name of the CM, accession number, etc.
        - advanced information: nodes and their states, and the states
          themselves.
        - unsorted information from the header / basic block.

      The CM data structure is not suitable for high-performance
      applications.

      Score inequalities: trusted (lowest seed score) >= gathering (lowest
      full score) >= noise (random strings).

      Local entries into the CM. The localBegin lens returns a map of state
      IDs. We either have just the root node (with the S state), or a set of
      states with type MP, ML, MR, or B. The localEnd lens, on the other
      hand, is the set of possible early exits from the model.

      Field documentation, in record order:
        - Name of model, as in "tRNA".
        - RFxxxxx identification.
        - We can parse version 1.0 and 1.1 CMs.
        - Lowest score of any seed member.
        - All scores at or above the gathering score are in the full
          alignment.
        - Highest score NOT included as member.
        - Null model: categorical distribution on ACGU.
        - Each node has a set of states.
        - Each state has a type, some emit characters, and some have
          children.
        - Entries into the CM.
        - Exits out of the CM.
        - All lines that are not handled. Multiline entries are
          key -> multi-line entry.

  AC2CM
      Map of model accession numbers to individual CMs.

  ID2CM
      Map of model names to individual CMs.

  makeLocal
      Make a CM have local start/end behaviour, with pbegin and pend
      probabilities given.

  makeLocalBegin
      Insert all legal local beginnings, disable root node (and root
      states). The pbegin probability is the total probability for local
      begins. The remaining 1-pbegin is the probability to start with
      node 1.

  makeLocalEnd
      Insert all legal local ends.


Biobase.SElab.CM.Import

  Top-level parser for Infernal 1.0 and 1.1 human-readable covariance
  models. Reads all lines first, then builds up the CM.

  Infernal 1.0 header parser. Greps all lines until the MODEL: line, then
  returns the lines to the top-level parser. Parses three lines at once in
  case of FT- lines.

  isNode
      Determine if a line is a node line. If yes, we'll get the node type
      as a string and the node identifier, too.


Names recorded in the interface, in stored order

  Classification, unClassification, BitScore, unBitScore, Species, Rfam,
  Pfam, Clan, Identification, IDD, unIDD, Accession, ACC, unACC, prob2Score,
  score2Prob, Node, _nid, _matchE, _insertE, _trans, NegLogProb, NLP,
  Alphabet, Custom, Dice, Coins, RNA, DNA, Amino, HMM, HMM3, _version, _idd,
  _acc, _description, _leng, _alph, _rf, _cs, _alignMap, _date, _symAlph,
  _transHeaders, _compo, _nodes, insertE, matchE, nid, trans, acc, alignMap,
  alph, compo, cs, date, description, idd, leng, nodes, rf, symAlph,
  transHeaders, version, parseHMM3, legalHMM, readBoolean, readAlph, readBS,
  headerMap, parseBegin, parseNodes, readNLP, compoLine, sathLines,
  headerLines, test, ModelNames, _modelAC, _modelID, _speciesAC, _speciesID,
  modelAC, modelID, speciesAC, speciesID, parse, mkRfamName, mapIdRfamNames,
  mapAcRfamNames, fromFile, Taxonomy, _accession, _name, _classification,
  accession, classification, name, shortenName, mkTaxonomy, mapIdTaxonomy,
  mapAcTaxonomy, Emits, EmitNothing, EmitsPair, _pair, EmitsSingle, _single,
  StateID, unStateID, StateType, EL, B, E, S, IR, IL, MR, ML, MP, D, NodeID,
  unNodeID, NodeType, END, ROOT, BEGR, BEGL, MATR, MATL, MATP, BIF,
  CMVersion, Infernal11, Infernal10, illegalState, State, _stateID, _nodeID,
  _nodeType, _stateType, _transitions, _emits, pair, single, CM,
  _trustedCutoff, _gathering, _noiseCutoff, _nullModel, _states,
  _localBegin, _localEnd, _unsorted, _hmm, emits, nodeID, nodeType, stateID,
  stateType, transitions, AC2CM, ID2CM, gathering, hmm, localBegin,
  localEnd, noiseCutoff, nullModel, states, trustedCutoff, unsorted,
  makeLocal, makeLocalBegin, makeLocalEnd, parseHeader, lineParser,
  parseCM1x, readBitScore, readAccession, parseHeaders, finishedHeader,
  parseStates, parseState, isNode
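
The BitScore documentation gives the log-odds formula S = log_2 (P(seq|CM) / P(seq|null)), and the NegLogProb documentation describes a negated natural logarithm of a probability. The following is a minimal sketch of both conversions, not the library's code. The newtypes here are simplified stand-ins, the function names probToBits, bitsToProb, probToNLP, and nlpToProb are hypothetical, and the null model is assumed to be a single probability, which may differ from how prob2Score and score2Prob take their arguments.

```haskell
-- Simplified stand-in for the library's BitScore (assumption: the real type
-- also derives Num, which is omitted here).
newtype BitScore = BitScore { unBitScore :: Double }
  deriving (Eq, Ord, Show)

-- Simplified stand-in for NegLogProb: negated natural log of a probability.
newtype NegLogProb = NLP { unNLP :: Double }
  deriving (Eq, Ord, Show)

-- S = log_2 (p / nullP), where nullP is the null-model probability.
-- Hypothetical name and argument order.
probToBits :: Double -> Double -> BitScore
probToBits nullP p = BitScore (logBase 2 (p / nullP))

-- Inverse of probToBits: p = nullP * 2^S.
bitsToProb :: Double -> BitScore -> Double
bitsToProb nullP (BitScore s) = nullP * 2 ** s

-- Negated natural logarithm of a probability.
probToNLP :: Double -> NegLogProb
probToNLP = NLP . negate . log

-- Inverse of probToNLP.
nlpToProb :: NegLogProb -> Double
nlpToProb = exp . negate . unNLP

main :: IO ()
main = do
  let nullP = 0.25                  -- uniform null model over ACGU
      s     = probToBits nullP 0.5  -- twice as likely as under the null
  print s
  print (bitsToProb nullP s)
  print (nlpToProb (probToNLP 0.1))
```

A probability equal to the null probability gives a score of zero bits, and each doubling of the ratio adds one bit. The round trips through bitsToProb and nlpToProb recover the input only up to floating-point error.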