Readme for dephd-0.0
Synopsis
--------
dephd - A simple tool for base calling and quality appraisal
It reads files in phd format (phred output), specified either
individually or by directory (use the --dir option to read directories).
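
As a rough illustration of the input, the sketch below pulls the
sequence name, bases and quality values out of one phd file. It is
written in Python purely for illustration and is not how dephd itself
(a Haskell program using the 'bio' library) reads its input. It assumes
the usual phred layout: a BEGIN_SEQUENCE line carrying the name, and a
BEGIN_DNA ... END_DNA block with one "base quality position" triple per
line. The function name parse_phd and the sample record are made up for
this example.

```python
def parse_phd(text):
    """Return (name, bases, quals) from the text of a single phd file."""
    name, bases, quals = None, [], []
    in_dna = False
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if fields[0] == "BEGIN_SEQUENCE":
            name = fields[1] if len(fields) > 1 else None
        elif fields[0] == "BEGIN_DNA":
            in_dna = True
        elif fields[0] == "END_DNA":
            in_dna = False
        elif in_dna:
            # Each DNA line is: base, quality, trace position.
            bases.append(fields[0])
            quals.append(int(fields[1]))
    return name, "".join(bases), quals


# Hypothetical, heavily shortened phd record used only for this example.
SAMPLE = """BEGIN_SEQUENCE read1

BEGIN_COMMENT
CHROMAT_FILE: read1
END_COMMENT

BEGIN_DNA
a 9 5
c 25 17
g 31 29
t 8 42
END_DNA

END_SEQUENCE
"""

name, bases, quals = parse_phd(SAMPLE)
print(name, bases, quals)
```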
Installation
------------
You need the GHC compiler or, if you know what you are doing, another
Haskell compiler or interpreter with Cabal. You also need to install
the 'bio' library (darcs get http://malde.org/~ketil/bio).
With those things in place, you should be able to run:

    runhaskell Setup configure
    runhaskell Setup build
    sudo runhaskell Setup install

Optionally, add "--prefix $HOME" (without the quotes) after configure
to install to your home directory, in which case you don't need the
'sudo'.
Usage
-----
    dephd --rank files..
    dephd --rank --dir dirs..

Writes a summary of all phd files to standard output, including the
sequence name, the average quality, the length of the longest
contiguous run with qualities >= 15, >= 20 and >= 30, and the longest
run with a sliding average quality of 20 or better.
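
The statistics above can be sketched roughly as follows. This is an
illustration in Python, not dephd's actual code. The README does not
state the sliding window size or exactly how the sliding-average run is
measured, so the window of 3 and the choice to count the bases covered
by consecutive qualifying windows are assumptions. All function names
are made up for this example.

```python
def longest_run(values, threshold):
    """Length of the longest contiguous run of values >= threshold."""
    best = current = 0
    for v in values:
        current = current + 1 if v >= threshold else 0
        best = max(best, current)
    return best


def sliding_average(quals, window):
    """Mean quality of every full window of the given size."""
    if window <= 0 or len(quals) < window:
        return []
    total = sum(quals[:window])
    averages = [total / window]
    for i in range(window, len(quals)):
        total += quals[i] - quals[i - window]
        averages.append(total / window)
    return averages


def longest_sliding_run(quals, threshold=20, window=3):
    """Bases covered by the longest run of windows averaging >= threshold.

    Assumption: a run of k consecutive qualifying windows covers
    k + window - 1 bases.
    """
    run = longest_run(sliding_average(quals, window), threshold)
    return run + window - 1 if run else 0


quals = [10, 16, 22, 35, 31, 18, 25, 9]
print("average quality:", sum(quals) / len(quals))
for t in (15, 20, 30):
    print("longest run >=", t, ":", longest_run(quals, t))
print("longest sliding run:", longest_sliding_run(quals))
```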
    dephd --call files..
    dephd --call --dir dirs..

Produces the files 'dephd.fasta' and 'dephd.qual' in the current
directory, containing sequence and quality data, respectively. Bases
estimated (currently very conservatively) to be of good quality are
written in upper case; very low quality bases are output as lower case
'n's.
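
The case convention can be illustrated with a small Python sketch. It is
not dephd's implementation: the README gives neither the quality
cutoffs nor the way the estimate is made, so the per-base thresholds
"good" and "bad" below are hypothetical, as is the choice to write the
bases in between as ordinary lower case letters.

```python
def mask_bases(bases, quals, good=20, bad=5):
    """Apply an upper case / lower case / 'n' convention to a sequence.

    The thresholds are assumptions for illustration only.
    """
    out = []
    for base, q in zip(bases, quals):
        if q >= good:
            out.append(base.upper())   # good quality: upper case
        elif q < bad:
            out.append("n")            # very low quality: masked as 'n'
        else:
            out.append(base.lower())   # assumed: in-between stays lower case
    return "".join(out)


print(mask_bases("acgtac", [30, 3, 12, 25, 4, 20]))
```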
    dephd --plot files..
    dephd --plot --dir dirs..
    dephd --plot -X files..
    dephd --plot -X --dir dirs..

Produces a plot of sequence quality, along with a sliding average.
With -X, the plot is displayed directly; without -X, a jpg file
containing the plot is produced. A similar option, --plotall, generates
and displays all plots directly instead of one at a time (only useful
together with -X).
Bugs
----
Not many, I hope. Specifying more than one action at a time will pull
all sequences into memory, but a single action should stream okay.
Approximately 15,000 phd files can be --call'ed OR --plot'ed OR
--rank'ed with less than 100 MB of RAM.
For further questions, email me at <ketil@malde.org>