TODO: Check out options for freqtable data structure.
      Things to try out: fmindex/afi, hashtable, accumArray, HsJudy
      And perhaps combining key and count into an Int(|eger|64)?

----------------------------------------

TODO: support shaped keys

TODO: Repeat ID: output report (.tbl, .out)
      * calculate 1..k'th order entropy
      * other?

TODO: Mask against library

TODO: three-pass: build FT, build library, mask against it

TODO: calculate distrib and mask over windows (w=200? 400?)
      avoids different treatment of different length sequences

- Clustering
- Clustering with (SG/Lee) assembly
  -> statistics to use when clustering:
     1. mode of word counts distribution (= coverage)
     2. estimated p value (1-var/mu) (= avg. overlap)
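
Sketch for the two clustering statistics above (mode of the word count
distribution, and p estimated as 1-var/mu). This is only a rough
illustration, not the project's code: it assumes the frequency table is a
plain Data.Map from k-words to counts, and the names wordCounts, countMode
and estimateP are made up for this note. The 1-var/mu estimate relies on
the counts being roughly binomial, where var/mu = 1-p; the variance here is
the population variance, which is also an assumption.

```haskell
import qualified Data.Map.Strict as M
import Data.List (maximumBy)
import Data.Ord (comparing)

-- | Count every overlapping k-word in a sequence.
--   (Hypothetical stand-in for the real freqtable.)
wordCounts :: Int -> String -> M.Map String Int
wordCounts k s = M.fromListWith (+) [ (w, 1) | w <- kwords s ]
  where
    kwords xs
      | k <= 0 || length w < k = []
      | otherwise              = w : kwords (tail xs)
      where w = take k xs

-- | Mode of the count distribution: the count value shared by the most
--   words. Ties are broken in favour of the larger count.
countMode :: [Int] -> Maybe Int
countMode [] = Nothing
countMode cs = Just (fst (maximumBy (comparing rank) (M.toList hist)))
  where
    hist        = M.fromListWith (+) [ (c, 1 :: Int) | c <- cs ]
    rank (c, n) = (n, c)

-- | Estimate p as 1 - var/mu over the counts (population variance).
--   Returns Nothing for empty input or a zero mean.
estimateP :: [Int] -> Maybe Double
estimateP [] = Nothing
estimateP cs
  | mu == 0   = Nothing
  | otherwise = Just (1 - var / mu)
  where
    xs  = map fromIntegral cs :: [Double]
    n   = fromIntegral (length cs)
    mu  = sum xs / n
    var = sum [ (x - mu) * (x - mu) | x <- xs ] / n
```

Usage: with ft = wordCounts k sequence, apply countMode and estimateP to
M.elems ft to get the coverage and average overlap estimates.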