phybin-0.3: Utility for clustering phylogenetic trees in Newick format based on Robinson-Foulds distance.

Safe Haskell: None

Bio.Phylogeny.PhyBin.CoreTypes

Tree and tree decoration types

data NewickTree a

Even though the Newick format allows it, here we ignore interior node labels. (They are not commonly used.)

Note that these trees are rooted. The normalize function ensures that a single, canonical rooted representation is chosen.

Constructors

NTLeaf a !Label 
NTInterior a [NewickTree a] 
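
As an illustration of the two constructors, the sketch below builds the rooted tree ((0,1),2). The types are re-declared locally so the snippet compiles on its own; `exampleTree` and the choice of `()` as the decoration are illustrative and not part of the package.

```haskell
-- Local mirror of the documented types, so the sketch is self-contained.
type Label = Int

data NewickTree a
  = NTLeaf a !Label
  | NTInterior a [NewickTree a]
  deriving (Show, Eq)

-- The rooted tree ((0,1),2), with the unit value as a placeholder decoration.
-- Only leaves carry labels; interior nodes carry just a decoration.
exampleTree :: NewickTree ()
exampleTree =
  NTInterior ()
    [ NTInterior () [NTLeaf () 0, NTLeaf () 1]
    , NTLeaf () 2
    ]
```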

type DefDecor = (Maybe Int, BranchLen)

The bare-bones default decorator for NewickTrees contains a bootstrap value and a branch length. Bootstrap values, if present, lie in the range [0..100].

data StandardDecor

The standard decoration includes everything in DefDecor plus some extra cached data:

  1. branch length from parent to this node
  2. bootstrap values for the node
  3. subtree weights for future use (defined as number of LEAVES, not counting intermediate nodes)
  4. sorted lists of labels for symmetry breaking

Constructors

StandardDecor 

Fields

branchLen :: BranchLen
 
bootStrap :: Maybe Int
 
subtreeWeight :: Int
 
sortedLabels :: [Label]
 

type AnnotatedTree = NewickTree StandardDecor

Additionally includes some scratch data that is used by the binning algorithm.

data FullTree a

A common type of tree contains the standard decorator and also a table for restoring the human-readable node names.

Constructors

FullTree 

data ClustMode

Constructors

BinThem 
ClusterThem 

Fields

linkage :: Linkage
 

data NumTaxa

How many taxa should we expect in the incoming dataset?

Constructors

Expected Int

Supplied by the user. Committed.

Unknown

In the future we may automatically pick a behavior. For now, this is usually an error.

Variable

Explicitly ignore this setting in favor of comparing all trees (even if some are missing taxa). This only works with certain modes.

Tree operations

displayDefaultTree :: FullTree DefDecor -> Doc

Display a tree WITH the bootstrap and branch lengths. This prints in NEWICK format.

displayStrippedTree :: FullTree a -> Doc

The same, except with no bootstrap values or branch lengths. Any tree annotations are ignored.
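
To make the stripped Newick output concrete, here is a minimal sketch that renders only the topology and leaf names. It is not the package's implementation: it returns a `String` rather than a `Doc`, it takes the label table and tree as separate arguments because the fields of `FullTree` are not listed here, and `renderStripped` is a hypothetical name.

```haskell
import Data.List (intercalate)
import qualified Data.Map as Map

-- Local mirror of the documented types, so the sketch is self-contained.
type Label = Int
type LabelTable = Map.Map Label String

data NewickTree a
  = NTLeaf a !Label
  | NTInterior a [NewickTree a]

-- Hypothetical helper: render topology and leaf names only, ignoring every
-- decoration. A label missing from the table falls back to its number.
renderStripped :: LabelTable -> NewickTree a -> String
renderStripped tbl tree = go tree ++ ";"
  where
    go (NTLeaf _ l)        = Map.findWithDefault (show l) l tbl
    go (NTInterior _ kids) = "(" ++ intercalate "," (map go kids) ++ ")"
```

With the table {0 -> "A", 1 -> "B", 2 -> "C"} and the tree ((0,1),2), this produces `((A,B),C);`.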

treeSize :: NewickTree a -> Int

How many nodes (leaves and interior) are contained in a NewickTree?

numLeaves :: NewickTree a -> Int

This counts only leaf nodes, which should include all taxa.
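
The difference between the two counts can be sketched as follows. These are local re-implementations written from the descriptions above, not the package source; the primed names `treeSize'` and `numLeaves'` and the sample tree are assumptions made for illustration.

```haskell
-- Local mirror of the documented types, so the sketch is self-contained.
type Label = Int

data NewickTree a
  = NTLeaf a !Label
  | NTInterior a [NewickTree a]

-- Counts every node: each leaf and each interior node contributes 1.
treeSize' :: NewickTree a -> Int
treeSize' (NTLeaf _ _)        = 1
treeSize' (NTInterior _ kids) = 1 + sum (map treeSize' kids)

-- Counts leaves only: interior nodes contribute nothing themselves.
numLeaves' :: NewickTree a -> Int
numLeaves' (NTLeaf _ _)        = 1
numLeaves' (NTInterior _ kids) = sum (map numLeaves' kids)

-- The tree ((0,1),2): two interior nodes and three leaves.
sample :: NewickTree ()
sample = NTInterior () [NTInterior () [NTLeaf () 0, NTLeaf () 1], NTLeaf () 2]
```

For `sample`, `treeSize'` gives 5 and `numLeaves'` gives 3.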

map_labels :: (Label -> Label) -> NewickTree a -> NewickTree a

Apply a function to all the *labels* (leaf names) in a tree.

all_labels :: NewickTree t -> [Label]

Return all the labels contained in the tree.
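
A minimal sketch of both label operations, written from the descriptions above rather than taken from the package. The names `mapLabels'` and `allLabels'` are hypothetical, and the left-to-right order of the returned labels is an assumption; the documentation does not state an order.

```haskell
-- Local mirror of the documented types, so the sketch is self-contained.
type Label = Int

data NewickTree a
  = NTLeaf a !Label
  | NTInterior a [NewickTree a]

-- Rewrite every leaf label, leaving shape and decorations untouched.
mapLabels' :: (Label -> Label) -> NewickTree a -> NewickTree a
mapLabels' f (NTLeaf d l)        = NTLeaf d (f l)
mapLabels' f (NTInterior d kids) = NTInterior d (map (mapLabels' f) kids)

-- Collect leaf labels in left-to-right order (assumed order).
allLabels' :: NewickTree a -> [Label]
allLabels' (NTLeaf _ l)        = [l]
allLabels' (NTInterior _ kids) = concatMap allLabels' kids
```

For the tree ((0,1),2), `allLabels' (mapLabels' (+10) tree)` yields `[10,11,12]`.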

foldIsomorphicTrees :: ([a] -> b) -> [NewickTree a] -> NewickTree b

This function collapses multiple trees into one, looking only at the horizontal slice of all the annotations *at a given position* in the tree.

The trees must be isomorphic in both shape and leaf labels; applying this function to non-isomorphic trees is an error.
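
The "horizontal slice" idea can be sketched as below: at each position, the decorations from all input trees are gathered into a list and handed to the combining function. This is an illustrative re-implementation under the stated isomorphism requirement, not the package source; `foldIso` is a hypothetical name, and raising `error` on a mismatch or an empty input is an assumption about how the failure surfaces.

```haskell
import Data.List (transpose)

-- Local mirror of the documented types, so the sketch is self-contained.
type Label = Int

data NewickTree a
  = NTLeaf a !Label
  | NTInterior a [NewickTree a]
  deriving (Show, Eq)

-- Combine the decorations found at the same position in every tree.
foldIso :: ([a] -> b) -> [NewickTree a] -> NewickTree b
foldIso _ [] = error "foldIso: empty list of trees"
foldIso f trees@(NTLeaf _ lbl : _) =
  NTLeaf (f (map leafDecor trees)) lbl
  where
    -- Every tree must have a leaf with the same label at this position.
    leafDecor (NTLeaf d l) | l == lbl = d
    leafDecor _ = error "foldIso: trees are not isomorphic"
foldIso f trees@(NTInterior _ kids0 : _) =
  NTInterior (f (map fst parts))
             (map (foldIso f) (transpose (map snd parts)))
  where
    n     = length kids0
    parts = map split trees
    -- Every tree must have an interior node with n children here.
    split (NTInterior d ks) | length ks == n = (d, ks)
    split _ = error "foldIso: trees are not isomorphic"
```

For example, `foldIso sum` applied to two isomorphic trees with `Int` decorations produces a tree of the same shape whose decorations are position-wise sums.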

Utilities specific to StandardDecor:

avg_branchlen :: HasBranchLen a => [NewickTree a] -> Double

Average branch length across all branches in all trees.
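
One plausible reading of this average is sketched below. Several points are assumptions rather than documented behavior: `BranchLen` is taken to be `Double`, every node (including the root) contributes its stored branch length, and an empty input yields 0. The `LenOnly` decoration and the name `avgBranchLen'` are hypothetical.

```haskell
-- Local mirror of the documented types, so the sketch is self-contained.
type Label = Int
type BranchLen = Double  -- assumption: branch lengths are Doubles

data NewickTree a
  = NTLeaf a !Label
  | NTInterior a [NewickTree a]

class HasBranchLen a where
  getBranchLen :: a -> BranchLen

-- Hypothetical decoration that stores nothing but a branch length.
newtype LenOnly = LenOnly BranchLen

instance HasBranchLen LenOnly where
  getBranchLen (LenOnly x) = x

-- Mean of the branch length stored at every node of every tree.
-- Returns 0 for an empty input instead of dividing by zero (assumption).
avgBranchLen' :: HasBranchLen a => [NewickTree a] -> Double
avgBranchLen' trees
  | null lens = 0
  | otherwise = sum lens / fromIntegral (length lens)
  where
    lens = concatMap nodeLens trees
    nodeLens (NTLeaf d _)        = [getBranchLen d]
    nodeLens (NTInterior d kids) = getBranchLen d : concatMap nodeLens kids
```

The package may treat the root or zero-length branches differently, so results from this sketch should not be expected to match `avg_branchlen` exactly.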

get_bootstraps :: NewickTree StandardDecor -> [Int]

Retrieve all the bootstrap values actually present in a tree.
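
Since `bootStrap` is a `Maybe Int`, "actually present" can be read as "every `Just` value in the tree". The sketch below follows that reading; it re-declares the documented record locally, assumes `BranchLen` is `Double`, and returns values in pre-order, which is an assumption. `getBootstraps'` is a hypothetical name.

```haskell
import Data.Maybe (maybeToList)

-- Local mirror of the documented types, so the sketch is self-contained.
type Label = Int
type BranchLen = Double  -- assumption: branch lengths are Doubles

data NewickTree a
  = NTLeaf a !Label
  | NTInterior a [NewickTree a]

data StandardDecor = StandardDecor
  { branchLen     :: BranchLen
  , bootStrap     :: Maybe Int
  , subtreeWeight :: Int
  , sortedLabels  :: [Label]
  }

-- Collect every bootstrap value that is present, skipping the Nothings.
getBootstraps' :: NewickTree StandardDecor -> [Int]
getBootstraps' (NTLeaf d _) = maybeToList (bootStrap d)
getBootstraps' (NTInterior d kids) =
  maybeToList (bootStrap d) ++ concatMap getBootstraps' kids
```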

Command line config options

data PhyBinConfig

Due to the number of configuration options for the driver, we pack them into a record.

Constructors

PBC 

Fields

verbose :: Bool
 
num_taxa :: NumTaxa
 
name_hack :: String -> String
 
output_dir :: String
 
inputs :: [String]
 
do_graph :: Bool
 
do_draw :: Bool
 
clust_mode :: ClustMode
 
highlights :: [FilePath]
 
show_trees_in_dendro :: Bool
 
show_interior_consensus :: Bool
 
rfmode :: WhichRFMode
 
preprune_labels :: Maybe [String]
 
print_rfmatrix :: Bool
 
dist_thresh :: Maybe Int
 
branch_collapse_thresh :: Maybe Double

Branches shorter than this length are collapsed.

bootstrap_collapse_thresh :: Maybe Int

Bootstrap values below this threshold cause the intermediate node to be collapsed.
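
To illustrate what collapsing an intermediate node might mean, the sketch below splices the children of a weakly supported interior node into its parent. This is a guess at the general technique, not the package's algorithm: `collapseLowBootstrap` is a hypothetical name, the root is never collapsed, nodes without a bootstrap value are kept, and the branch length of a removed node is simply dropped rather than redistributed.

```haskell
-- Local mirror of the documented types, so the sketch is self-contained.
type Label = Int
type BranchLen = Double  -- assumption: branch lengths are Doubles
type DefDecor = (Maybe Int, BranchLen)

data NewickTree a
  = NTLeaf a !Label
  | NTInterior a [NewickTree a]
  deriving (Show, Eq)

-- Remove interior nodes whose bootstrap is below the threshold by
-- attaching their children directly to the parent.
collapseLowBootstrap :: Int -> NewickTree DefDecor -> NewickTree DefDecor
collapseLowBootstrap thresh = go
  where
    go leaf@(NTLeaf _ _)   = leaf
    go (NTInterior d kids) = NTInterior d (concatMap (splice . go) kids)

    -- A weakly supported interior child is replaced by its own children.
    splice (NTInterior (Just b, _) grandkids) | b < thresh = grandkids
    splice node = [node]
```

With a threshold of 50, a tree ((0,1),2) whose inner node has bootstrap 40 becomes the three-way split (0,1,2).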

default_phybin_config :: PhyBinConfig

The default phybin configuration.

data WhichRFMode

Supported modes for computing RFDistance.

Constructors

HashRF 
TolerantNaive 

General helpers

type Label = Int

Labels are inexpensive unique integers. The table is necessary for converting them back.

type LabelTable = Map Label String

Map labels back onto meaningful names.
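
A small sketch of the round trip between names and integer labels, using `Data.Map` from the containers package. The helpers `mkLabelTable` and `labelName` are hypothetical, as is the fallback text for a label that is missing from the table.

```haskell
import qualified Data.Map as Map

-- Local mirror of the documented synonyms, so the sketch is self-contained.
type Label = Int
type LabelTable = Map.Map Label String

-- Hypothetical helper: assign consecutive labels 0, 1, 2, ... to names.
mkLabelTable :: [String] -> LabelTable
mkLabelTable names = Map.fromList (zip [0 ..] names)

-- Hypothetical helper: recover the readable name for a label.
labelName :: LabelTable -> Label -> String
labelName tbl l = Map.findWithDefault ("<unknown " ++ show l ++ ">") l tbl
```

For instance, `labelName (mkLabelTable ["human", "chimp"]) 1` gives `"chimp"`.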

Experimenting with abstracting decoration operations

class HasBranchLen a where

Methods

getBranchLen :: a -> BranchLen