Readme for simseq-0.0
Simseq - SIMulate SEQuences. Yep, that's real creative.
Synopsis
--------
Generates a bunch of sequences from a set of reference sequences.
For ESTs, NCBI's refseq transcripts are probably good choices.
The sequences are generated using a model that specifies
priming conditions and error generation.
Currently, this is not very refined. You can try:
    simseq --model=sanger:n,d reference.fasta
where n is the number of sequences to generate, with starting points
drawn from a uniform distribution, and d is the probability of a
sequence being in the forward direction. Or, even more experimentally:
    simseq --model=454:n,d
which implements a completely unfounded and baseless model of 454/Roche
pyrosequencing. (Okay, actually based on a paper by Margulies et al., but
more data is definitely required.)
Solexa will be added as soon as anybody says something definitive
about the error modes.
In any case, running out of sequence results in X's, indicating vector,
which I hope makes sense for Sanger, at least.
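To make the sampling scheme above a bit more concrete, here is a rough
Python sketch of the idea. It is not simseq's actual code (simseq is written
in Haskell), and it leaves out error generation entirely. The function
names, the fixed read_length parameter, and the choice to model the
non-forward direction as a read from the reverse complement are all
assumptions made for illustration only.

```python
import random

# Complement table for plain DNA letters; other characters pass through.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def simulate_reads(reference, n, d, read_length=20, rng=None):
    """Draw n error-free reads from a non-empty reference string.

    n: number of reads to generate
    d: probability that a read is in the forward direction
    read_length: fixed read length (an assumption of this sketch)
    rng: optional random.Random instance, for reproducible runs
    """
    if rng is None:
        rng = random.Random()
    reads = []
    for _ in range(n):
        # Forward with probability d, otherwise read the other strand.
        forward = rng.random() < d
        template = reference if forward else reverse_complement(reference)
        # Starting point drawn uniformly over the template.
        start = rng.randrange(len(template))
        fragment = template[start:start + read_length]
        # Running out of sequence: pad the rest of the read with X's.
        reads.append(fragment + "X" * (read_length - len(fragment)))
    return reads
```

Calling simulate_reads("ACGTACGTAC", 5, 0.5, read_length=8) gives five
reads of length 8, some of them ending in X's where the read ran past the
end of the reference.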
Install
-------
The usual Cabal routine. Get a working GHC compiler, install
my 'bio' library, and do:
    chmod +x Setup.hs
    ./Setup.hs configure
    ./Setup.hs build
    sudo ./Setup.hs install
Mail me if it doesn't work - <ketil at malde.org>.