blosum-0.1.1.2

Modules: Types, Utility, Cluster, Matrix, Print

Types
  Blosum (unBlosum)
  FrequencyMap (unFrequencyMap)
  BlockMap (unBlockMap)
  AAMap (unAAMap)
  ClusterFrequencyMap (unClusterFrequencyMap)
  ClusterMap (unClusterMap)
  BlosumVal (unBlosumVal)
  Identity
  Position
  Frequency
  Field
  Nuc
  AA (unAA)

  Instances
    AA: Eq, Ord, Show, Read
    Nuc: Eq, Ord, Show, Read
    Field: Eq
    Frequency: Eq, Ord, Num, Enum, Show, Read, Fractional
    Position: Eq, Ord, Num, Enum, Show, Read
    Identity: Eq, Ord, Num, Show, Read
    BlosumVal: Eq, Ord, Num, Enum, Show, Read
    ClusterMap: Eq, Ord, Show
    ClusterFrequencyMap: Eq, Ord, Show
    AAMap: Monoid, Eq, Ord, Show
    BlockMap: Eq, Ord, Show
    FrequencyMap: Eq, Ord, Show
    Blosum: Eq, Ord, Show

Utility
  nub'         Faster nub.
  getSeq       Extract the sequence from a fasta sequence.
  sortTuple    Sort a tuple.
  groupBlocks  Group together fasta entries by a field.
  getField     Get a field from a fasta header.
  cutEnds      Cut off the ends of a sequence.
  sumMap       Sum up a map.

Cluster
  negHamming          Like hamming, but similarity rather than distance.
  getClusterIdentity  Takes in an identity and fasta sequences and returns the sequences grouped together by hamming distance identity.
  clusterIdentityGo   Keep comparing clusters until no more fusions (no change in size) make sense.
  groupBy'            Group together by all pairings rather than adjacent. Altered from lyxia's original to use sequences.
  eqTo                Helper to groupBy'. Altered from lyxia's original to use sequences.
  compareSeqs         Either the sequences are similar or not.
  getIdentity         Get the identity between two sequences.

Matrix
  zipSize                 Zip positions into fasta sequences with a certain size (number of sequences in a cluster) of AA pairs.
  zipPosition             Zip positions into sequences to get position pairs.
  getClusterFrequencyMap  Get the frequencies of amino acid pairs for each position in a cluster.
  removeChar              Filter gaps out of the map. If no gaps are wanted, remove the entire position.
  collectPairs
  toAAMap
  getBlockMap             Get the frequency matrix from a list of frequency maps from clusters. We no longer care about positions after this.
  joinBlockMaps           Join together all frequency maps into a single frequency map.
  getBlosum               Get the blosum matrix of each AA entry.

Print
  printBlosum     Print the BLOSUM matrix as a dataframe.
  printBlosumCSV  Print the BLOSUM matrix as a csv matrix according to a certain order.