úÎ!q´gH˜      !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~€‚ƒ„…†‡ˆ‰Š‹ŒŽ‘’“”•–—None7Ô RNAlien,Data structure for RNAcentral entry response˜RNAlien Send query and parse return XML ™RNAlien"Send query and return response XMLRNAlien4Function for querying the RNAcentral REST interface.šRNAlienrFunction for delayed queries to the RNAcentral REST interface. Enforces the maximum 20 requests per second policy.RNAlien#Build a query from a input sequenceATODO [chzs] consider using strict bytestring as long as possible.CTODO [chzs] consider giving useful typelevel names to the types in FastaW. One may give a type-level name to the sequence identifier, and an identifier (like DNA) to the biosequence type.    NoneRRNAlien0RNAlien?RNAlienNRNAlien!Keeps track of model constructionZRNAlienStatic construction optionsM+*)('&%$#"! ,-/.01>=<;:98765432?@DCBAEFIHGJKMLNOYXWVUTSRQPZ[jihfedcba`_^]\gMZ[jihfedcba`_^]\gNOYXWVUTSRQPJKMLEFIHG?@DCBA01>=<;:98765432,-/.+*)('&%$#"! NoneZyRNAlien'parse from input filePath zRNAlien'parse from input filePath {RNAlien/parse from input filePath |RNAlien/parse from input filePath ›RNAlienVParsing function for CMSearches with multiple querymodels in one modelfile, e.g. clans}RNAlien'parse from input filePath ~RNAlien/parse from input filePath S !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[g\]^_`abcdefhijyz{|}~{|yz}~NoneSXe ,RNAlien\Initial RNA family model construction - generates iteration number, seed alignment and modelœRNAlienHReevaluate collected potential members for inclusion in the result modelRNAlienComputes size of blast db in MbžRNAlien|Replaces structure of input stockholm file with the consensus structure of alifoldFilepath and outputs updated stockholmfileŸRNAlien)Used for passing progress to Alien server‚RNAlien)Used for passing progress to Alien server RNAlien}Filter duplicates removes hits in sequences that were already collected. This happens during revisiting the starting subtree.¡RNAlienÿûFilter a list of similar extended blast hits filterIdenticalSequencesWithOrigin :: [(Fasta,Int,String,Char)] -> Double -> [(Fasta,Int,String,Char)] filterIdenticalSequencesWithOrigin (headSequence:rest) identitycutoff = result where filteredSequences = filter (x -> (sequenceIdentity (firstOfQuadruple headSequence) (firstOfQuadruple x)) < identitycutoff) rest result = headSequence:(filterIdenticalSequencesWithOrigin filteredSequences identitycutoff) filterIdenticalSequencesWithOrigin [] _ = [],Filter a list of similar extended blast hits¢RNAlien9Filter sequences too similar to already aligned sequences£RNAlien'Filter alignment entries by similiarity¤RNAlienJCheck if the result field of BlastResult is filled and if hits are present¥RNAlienCompute identity of sequences¦RNAlienÿQCompute identity of sequences stringIdentity :: String -> String -> Double stringIdentity string1 string2 = identityPercent where distance = ED.levenshteinDistance costs string1 string2 --Replication of RNAz select sequences requires only allowing substitutions costs = ED.defaultEditCosts {ED.deletionCosts = ED.ConstantCost 100,ED.insertionCosts = ED.ConstantCost 100,ED.transpositionCosts = ED.ConstantCost 100} maximumDistance = maximum [length string1,length string2] identityPercent = 1 - (fromIntegral distance/fromIntegral maximumDistance)Compute identity of sequencesƒRNAlien]Partitions sequences by containing a cmsearch hit and extracts the hit region as new sequence„RNAlienPExtract a substring with coordinates from cmsearch, first nucleotide has index 1§RNAlien&Adds cm prefix to pseudo random number…RNAlienCreate session id for RNAlien¨RNAlienPRun external locarna command and read the output into the corresponding datatype©RNAlien™Run external mlocarna command and read the output into the corresponding datatype, there is also a folder created at the location of the input fasta fileªRNAlienÖRun external mlocarna command and read the output into the corresponding datatype, there is also a folder created at the location of the input fasta file, the job is terminated after the timeout provided in seconds«RNAlien5Run external clustalo command and return the Exitcode¬RNAlien5Run external clustalo command and return the Exitcode­RNAlienPRun external CMbuild command and read the output into the corresponding datatype®RNAlienARun CMCompare and read the output into the corresponding datatype†RNAlien Run CMsearch¯RNAlien Run CMstat°RNAlien#Run CMcalibrate and return exitcode±RNAlien#Run CMcalibrate and return exitcode²RNAlien.Hits should have a compareable length to query³RNAlien.Hits should have a compareable length to query´RNAlienJWrapper for retrieveFullSequence that rerequests incomplete return sequeesµRNAlien9NCBI uses the e-Value of the best HSP as the Hits e-Value¶RNAlienHWrapper functions that ensures that only 20 queries are sent per request·RNAlien"Extract taxids from JSON2 blasthit¸RNAlienÿWrapper functions that ensures that only 20 queries are sent per request retrieveBlastHitsTaxIdEntrez :: [J.Hit] -> IO [([J.Hit],String)] retrieveBlastHitsTaxIdEntrez blastHits = do let splits = portionListElements blastHits 20 mapM retrieveBlastHitTaxIdEntrez splitsRNAlien*Call for external preprocessClustalForRNAz‘RNAlienŽCall for external preprocessClustalForRNAcode - RNAcode additionally to RNAz requirements does not accept pipe,underscore, doublepoint symbols¹RNAlien*Sequence preselection for RNAz and RNAcode“RNAlien"Check if alien can connect to NCBIºRNAlien5Blast evalue is set stricter in inital alignment mode»RNAlienRun external blast command ¼RNAlienRetrieve taxids for blast —RNAlien`RNAlienScan RNA family model construction - generates iteration number, seed alignment and model½RNAlienJWrapper for retrieveFullSequence that rerequests incomplete return sequeesk !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[g\]^_`abcdefhijy{|}~€‚ƒ„…†‡ˆ‰Š‹ŒŽ‘’“”•–—…ˆ‰—Œ‚‹Š†{|‡y„€Ž~}“’‘”•ƒ–¾       !!"#$%&'()*+,-../01123456789:;<=>??@ABCDDEFGHHIJKKLMNOPQRSTUVVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~€‚ƒ„…†‡ˆ‰Š‹ŒŽ‘’“”•–—˜™š›œžŸ ¡¢£¤¥¦§¨©ª«¬­®¯°±²³´µ¶·¸¹$RNAlien-1.7.0-EDE2ba4eKAmGZ4EoPSBpShBiobase.RNAlien.RNAcentralHTTPBiobase.RNAlien.TypesBiobase.RNAlien.InfernalParserBiobase.RNAlien.LibraryRNAcentralEntryurl rnacentral_idmd5sequencelengthxrefs publicationsRNAcentralEntryResponsecountnextpreviousresultsrnaCentralHTTPgetRNACentralEntriesbuildSequenceViaMD5QuerybuildStringViaMD5QueryshowRNAcentralAlienEvaluation$fFromJSONRNAcentralEntry$fToJSONRNAcentralEntry!$fFromJSONRNAcentralEntryResponse$fToJSONRNAcentralEntryResponse$fShowRNAcentralEntry$fEqRNAcentralEntry$fGenericRNAcentralEntry$fShowRNAcentralEntryResponse$fEqRNAcentralEntryResponse $fGenericRNAcentralEntryResponseCMstat statIndexstatName statAccessionstatSequenceNumberstatEffectiveSequencesstatConsensusLengthstatW statBasepairsstatBifurcations statModelrelativeEntropyCMrelativeEntropyHMM SearchResult candidatesblastDatabaseSize CMsearchHithitRankhitSignificance hitEvaluehitScorehitBiashitSequenceHeaderhitStarthitEnd hitStrandhitModel hitTruncation hitGCContenthitDescriptionCMsearch queryCMfiletargetSequenceDatabasenumberOfWorkerThreads cmsearchHitsSequenceRecordnucleotideSequencealignedrecordDescriptionTaxonomyRecordrecordTaxonomyIdsequenceRecordsModelConstructioniterationNumber inputFasta taxRecordsupperTaxonomyLimittaxonomicContextevalueThresholdalignmentModeInfernalselectedQueriespotentialMembers genomeFastas StaticOptions tempDirPath sessionID nSCICutoff userTaxIdsingleHitperTaxTogglequerySelectionMethod queryNumberlengthFilterTogglecoverageFilterToggleblastSoftmaskingToggle cpuThreads blastDatabasetaxRestrictionverbositySwitchoffline$fShowSequenceRecord$fShowTaxonomyRecord$fShowSearchResult$fShowModelConstruction $fShowCMstat$fShowStaticOptions$fShowCMsearchHit$fEqCMsearchHit$fReadCMsearchHit$fShowCMsearch $fEqCMsearch$fReadCMsearch $fEqCMstat $fReadCMstat parseCMSearchparseCMSearches readCMSearchreadCMSearches parseCMstat readCMstatmodelConstructersetInitialTaxIdwriteFastaFile resultSummaryevaluePartitionTrimCMsearchHitscmSearchsubStringcreateSessionIDsystemCMsearch compareCM logMessage logEither checkToolslogToolVersions constructTaxonomyRecordsCSVTable setVerboseevaluateConstructionResultrnaZEvalOutput preprocessClustalForRNAzExternal#preprocessClustalForRNAcodeExternalpreprocessClustalForRNAzcheckNCBIConnection reformatFastacheckTaxonomyRestriction readFastaFilescanModelConstructer startSession sendQuerydelayedRNACentralHTTPgenParserMultipleCMSearchreevaluatePotentialMemberscomputeDataBaseSizereplaceStockholmStructureiterationSummaryfilterDuplicatesfilterIdenticalSequencesfilterWithCollectedSequencesfilterIdenticalSequences'blastMatchesPresent textIdentitysequenceIdentityrandomid systemlocarnasystemMlocarnasystemMlocarnaWithTimeoutsystemClustalw2systemClustalo systemCMbuildsystemCMcompare systemCMstatsystemCMcalibrate systemCMalignhitLengthCheck coverageCheckretrieveFullSequences hitEValueretrieveParentTaxIdsEntrezextractBlastHitsTaxIdretrieveBlastHitTaxIdEntrezrnaCodeSelectSeqs2setBlastExpectThreshold systemBlastsystemGetSpeciesTaxIdretrieveGenomeFullSequences