!ndd"      !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~None7 RNAlien,Data structure for RNAcentral entry responseRNAlien Send query and parse return XML RNAlien"Send query and return response XMLRNAlien4Function for querying the RNAcentral REST interface.RNAlienrFunction for delayed queries to the RNAcentral REST interface. Enforces the maximum 20 requests per second policy.RNAlien#Build a query from a input sequenceATODO [chzs] consider using strict bytestring as long as possible.CTODO [chzs] consider giving useful typelevel names to the types in FastaW. One may give a type-level name to the sequence identifier, and an identifier (like DNA) to the biosequence type.    None8RNAlien0RNAlien?RNAlienNRNAlien!Keeps track of model constructionYRNAlienStatic construction optionsL+*)('&%$#"! ,-/.01>=<;:98765432?@DCBAEFIHGJKMLNOXWVUTSRQPYZihgedcba`_^]\[fLYZihgedcba`_^]\[fNOXWVUTSRQPJKMLEFIHG?@DCBA01>=<;:98765432,-/.+*)('&%$#"! None8xRNAlien'parse from input filePath yRNAlien'parse from input filePath zRNAlien/parse from input filePath {RNAlien/parse from input filePath RNAlienVParsing function for CMSearches with multiple querymodels in one modelfile, e.g. clans|RNAlien'parse from input filePath }RNAlien/parse from input filePath R !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZf[\]^_`abcdeghixyz{|}z{xy|}NoneSXb*~RNAlien\Initial RNA family model construction - generates iteration number, seed alignment and modelRNAlienHReevaluate collected potential members for inclusion in the result modelRNAlienComputes size of blast db in MbRNAlien|Replaces structure of input stockholm file with the consensus structure of alifoldFilepath and outputs updated stockholmfileRNAlien)Used for passing progress to Alien serverRNAlien)Used for passing progress to Alien serverRNAlien}Filter duplicates removes hits in sequences that were already collected. This happens during revisiting the starting subtree.RNAlienFilter a list of similar extended blast hits filterIdenticalSequencesWithOrigin :: [(Fasta,Int,String,Char)] -> Double -> [(Fasta,Int,String,Char)] filterIdenticalSequencesWithOrigin (headSequence:rest) identitycutoff = result where filteredSequences = filter (x -> (sequenceIdentity (firstOfQuadruple headSequence) (firstOfQuadruple x)) < identitycutoff) rest result = headSequence:(filterIdenticalSequencesWithOrigin filteredSequences identitycutoff) filterIdenticalSequencesWithOrigin [] _ = [],Filter a list of similar extended blast hitsRNAlien9Filter sequences too similar to already aligned sequencesRNAlien'Filter alignment entries by similiarityRNAlienJCheck if the result field of BlastResult is filled and if hits are presentRNAlienCompute identity of sequencesRNAlienQCompute identity of sequences stringIdentity :: String -> String -> Double stringIdentity string1 string2 = identityPercent where distance = ED.levenshteinDistance costs string1 string2 --Replication of RNAz select sequences requires only allowing substitutions costs = ED.defaultEditCosts {ED.deletionCosts = ED.ConstantCost 100,ED.insertionCosts = ED.ConstantCost 100,ED.transpositionCosts = ED.ConstantCost 100} maximumDistance = maximum [length string1,length string2] identityPercent = 1 - (fromIntegral distance/fromIntegral maximumDistance)Compute identity of sequencesRNAlien]Partitions sequences by containing a cmsearch hit and extracts the hit region as new sequenceRNAlienPExtract a substring with coordinates from cmsearch, first nucleotide has index 1RNAlien&Adds cm prefix to pseudo random numberRNAlienCreate session id for RNAlienRNAlienPRun external locarna command and read the output into the corresponding datatypeRNAlienRun external mlocarna command and read the output into the corresponding datatype, there is also a folder created at the location of the input fasta fileRNAlienRun external mlocarna command and read the output into the corresponding datatype, there is also a folder created at the location of the input fasta file, the job is terminated after the timeout provided in secondsRNAlien5Run external clustalo command and return the ExitcodeRNAlien5Run external clustalo command and return the ExitcodeRNAlienPRun external CMbuild command and read the output into the corresponding datatypeRNAlienARun CMCompare and read the output into the corresponding datatypeRNAlien Run CMsearchRNAlien Run CMstatRNAlien#Run CMcalibrate and return exitcodeRNAlien#Run CMcalibrate and return exitcodeRNAlien.Hits should have a compareable length to queryRNAlien.Hits should have a compareable length to queryRNAlienJWrapper for retrieveFullSequence that rerequests incomplete return sequeesRNAlien9NCBI uses the e-Value of the best HSP as the Hits e-ValueRNAlienHWrapper functions that ensures that only 20 queries are sent per requestRNAlien"Extract taxids from JSON2 blasthitRNAlienWrapper functions that ensures that only 20 queries are sent per request retrieveBlastHitsTaxIdEntrez :: [J.Hit] -> IO [([J.Hit],String)] retrieveBlastHitsTaxIdEntrez blastHits = do let splits = portionListElements blastHits 20 mapM retrieveBlastHitTaxIdEntrez splitsRNAlien*Call for external preprocessClustalForRNAzRNAlienCall for external preprocessClustalForRNAcode - RNAcode additionally to RNAz requirements does not accept pipe,underscore, doublepoint symbolsRNAlien*Sequence preselection for RNAz and RNAcodeRNAlien"Check if alien can connect to NCBIRNAlien5Blast evalue is set stricter in inital alignment modeRNAlienRun external blast command RNAlienRetrieve taxids for blast i !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZf[\]^_`abcdeghixz{|}~~z{x}|       !!"#$%&'()*+,-../01123456789:;<=>??@ABCDDEFGHHIJKKLMNOPQRSTUUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~$RNAlien-1.6.0-EaDMQmhh3aCAzgb3dODI60Biobase.RNAlien.RNAcentralHTTPBiobase.RNAlien.TypesBiobase.RNAlien.InfernalParserBiobase.RNAlien.LibraryRNAcentralEntryurl rnacentral_idmd5sequencelengthxrefs publicationsRNAcentralEntryResponsecountnextpreviousresultsrnaCentralHTTPgetRNACentralEntriesbuildSequenceViaMD5QuerybuildStringViaMD5QueryshowRNAcentralAlienEvaluation$fFromJSONRNAcentralEntry$fToJSONRNAcentralEntry!$fFromJSONRNAcentralEntryResponse$fToJSONRNAcentralEntryResponse$fShowRNAcentralEntry$fEqRNAcentralEntry$fGenericRNAcentralEntry$fShowRNAcentralEntryResponse$fEqRNAcentralEntryResponse $fGenericRNAcentralEntryResponseCMstat statIndexstatName statAccessionstatSequenceNumberstatEffectiveSequencesstatConsensusLengthstatW statBasepairsstatBifurcations statModelrelativeEntropyCMrelativeEntropyHMM SearchResult candidatesblastDatabaseSize CMsearchHithitRankhitSignificance hitEvaluehitScorehitBiashitSequenceHeaderhitStarthitEnd hitStrandhitModel hitTruncation hitGCContenthitDescriptionCMsearch queryCMfiletargetSequenceDatabasenumberOfWorkerThreads cmsearchHitsSequenceRecordnucleotideSequencealignedrecordDescriptionTaxonomyRecordrecordTaxonomyIdsequenceRecordsModelConstructioniterationNumber inputFasta taxRecordsupperTaxonomyLimittaxonomicContextevalueThresholdalignmentModeInfernalselectedQueriespotentialMembers StaticOptions tempDirPath sessionID nSCICutoff userTaxIdsingleHitperTaxTogglequerySelectionMethod queryNumberlengthFilterTogglecoverageFilterToggleblastSoftmaskingToggle cpuThreads blastDatabasetaxRestrictionverbositySwitchoffline$fShowSequenceRecord$fShowTaxonomyRecord$fShowSearchResult$fShowModelConstruction $fShowCMstat$fShowStaticOptions$fShowCMsearchHit$fEqCMsearchHit$fReadCMsearchHit$fShowCMsearch $fEqCMsearch$fReadCMsearch $fEqCMstat $fReadCMstat parseCMSearchparseCMSearches readCMSearchreadCMSearches parseCMstat readCMstatmodelConstructersetInitialTaxIdwriteFastaFile resultSummaryevaluePartitionTrimCMsearchHitscmSearchsubStringcreateSessionIDsystemCMsearch compareCM logMessage logEither checkToolslogToolVersions constructTaxonomyRecordsCSVTable setVerboseevaluateConstructionResultrnaZEvalOutput preprocessClustalForRNAzExternal#preprocessClustalForRNAcodeExternalpreprocessClustalForRNAzcheckNCBIConnection reformatFastacheckTaxonomyRestriction readFastaFile startSession sendQuerydelayedRNACentralHTTPgenParserMultipleCMSearchreevaluatePotentialMemberscomputeDataBaseSizereplaceStockholmStructureiterationSummaryfilterDuplicatesfilterIdenticalSequencesfilterWithCollectedSequencesfilterIdenticalSequences'blastMatchesPresent textIdentitysequenceIdentityrandomid systemlocarnasystemMlocarnasystemMlocarnaWithTimeoutsystemClustalw2systemClustalo systemCMbuildsystemCMcompare systemCMstatsystemCMcalibrate systemCMalignhitLengthCheck coverageCheckretrieveFullSequences hitEValueretrieveParentTaxIdsEntrezextractBlastHitsTaxIdretrieveBlastHitTaxIdEntrezrnaCodeSelectSeqs2setBlastExpectThreshold systemBlastsystemGetSpeciesTaxId