hyraxAbif-0.2.3.15: Modules for parsing, generating and manipulating AB1 files.

Copyright: (c) HyraxBio 2018
License: BSD3
Maintainer: andre@hyraxbio.co.za, andre@andrevdm.com
Safe Haskell: Safe
Language: Haskell2010

Hyrax.Abif.Generate

Description

Functionality for generating AB1 files from an input FASTA. These AB1s are supported by both PHRED and recall. If you are using other software, you may need to add additional required sections.

Weighted reads

The input FASTA files have "weighted" reads. The name of each read is a value between 0 and 1 that specifies the height of the peak relative to a full peak.

Single read

The simplest example is a single FASTA containing a single read with a weight of 1:

> 1
ACTG

The chromatogram for this AB1 shows perfect traces for the input ACTG nucleotides with a full height peak.

Mixes & multiple reads

The source FASTA can have multiple reads, which results in a chromatogram with mixes:

> 1
ACAG
> 0.3
ACTG

There is an AT mix at the third nucleotide. The first read has a weight of 1 and the second a weight of 0.3. The chromatogram shows the mix, with the T peak lower than the A peak (30% of its height).

Summing weights

  • The weight of a read specifies the intensity of the peak from 0 to 1.
  • Weights for each position are added, to a maximum of 1 per nucleotide.
  • You can use `_` as a "blank" nucleotide, in which case only the nucleotides from other reads will be considered at that position.

E.g.

> 1
ACAG
> 0.3
_GT
> 0.2
_G
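
The summing rules can be sketched in a few lines of Haskell. This is a minimal illustration under one reading of the rules above (weights are summed per nucleotide at each position, each sum is capped at 1, and `_` contributes nothing). The names `WeightedRead` and `peakHeights` are hypothetical and this is not the library's implementation.

```haskell
import qualified Data.Map.Strict as Map

-- | A weighted read: (weight, nucleotides). '_' marks a blank position.
-- Hypothetical type, used only for this sketch.
type WeightedRead = (Double, String)

-- | For each position, sum the weights per nucleotide and cap each sum at 1.
-- Hypothetical helper; not part of Hyrax.Abif.Generate.
peakHeights :: [WeightedRead] -> [Map.Map Char Double]
peakHeights weightedReads =
  map column [0 .. longest - 1]
  where
    -- Length of the longest read; 0 when there are no reads.
    longest = maximum (0 : map (length . snd) weightedReads)
    -- Collect (nucleotide, weight) pairs at position i, skipping blanks
    -- and reads that are too short to reach position i.
    column i =
      Map.map (min 1) $
        Map.fromListWith (+)
          [ (n, w)
          | (w, ns) <- weightedReads
          , n <- take 1 (drop i ns)
          , n /= '_'
          ]
```

For the example above, `peakHeights [(1, "ACAG"), (0.3, "_GT"), (0.2, "_G")]` gives A at 1 for the first position, C at 1 and G at 0.5 for the second, A at 1 and T at 0.3 for the third, and G at 1 for the fourth.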

Reverse reads

A weighted FASTA can represent a reverse read. To do this, add an R suffix to the weight. Enter the data as if it were a forward read; it will be complemented and reversed before being written to the ABIF.

E.g.

> 1R
ACAG
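
As an illustration of "complemented and reversed", here is a minimal sketch using plain `String`s. The helper names `complementBase` and `reverseComplement` are hypothetical, and leaving characters other than A, C, G and T unchanged is an assumption of this sketch, not documented library behaviour.

```haskell
-- | Complement a single nucleotide. Any other character (for example the
-- '_' blank) is returned unchanged; that choice is an assumption here.
complementBase :: Char -> Char
complementBase 'A' = 'T'
complementBase 'T' = 'A'
complementBase 'C' = 'G'
complementBase 'G' = 'C'
complementBase c   = c

-- | Complement every base, then reverse the whole read.
-- Hypothetical helper; not part of Hyrax.Abif.Generate.
reverseComplement :: String -> String
reverseComplement = reverse . map complementBase
```

With this sketch, `reverseComplement "ACAG"` evaluates to `"CTGT"`.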

See README.md for additional details and examples.

Documentation

generateAb1s :: FilePath -> FilePath -> IO ()

Generate a set of AB1s, one for every FASTA found in the source directory.

generateAb1 :: (Text, [(Double, Text)]) -> ByteString

Create the ByteString data for an AB1 given the data from a weighted FASTA (see readWeightedFasta).

readWeightedFasta :: ByteString -> Either Text [(Double, Text)]

Read a weighted FASTA file. See the module documentation for details on the format of the weighted FASTA. Reads with a weight followed by an R are reverse reads, and the AB1 generated will contain the complemented sequence.

e.g. weighted FASTA

> 1
ACAG
> 0.3
_GT
> 0.2
_G

The result data has the type

  [(Double, Text)]
    ^        ^
    |        |
    |        +---- read 
    | 
    +---- weight
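
To make the format concrete, the following is a minimal sketch of a weighted FASTA parser that uses only `base`. It is not the library's parser: `Header`, `parseHeader` and `parseWeightedFasta` are hypothetical names, it works on `String` and returns `Maybe` rather than `Either Text`, and it keeps the reverse flag explicit instead of complementing the read.

```haskell
import Data.Char (isSpace)
import Text.Read (readMaybe)

-- | A parsed header line: the weight and whether an R suffix was present.
data Header = Header { weight :: Double, isReverse :: Bool }
  deriving (Show, Eq)

-- | Strip leading and trailing whitespace.
trim :: String -> String
trim = dropWhile isSpace . reverse . dropWhile isSpace . reverse

-- | Parse a header such as "> 1", "> 0.3" or "> 1R".
-- Weights outside the range 0 to 1 are rejected.
parseHeader :: String -> Maybe Header
parseHeader ('>' : rest) =
  case readMaybe num of
    Just w | w >= 0 && w <= 1 -> Just (Header w rev)
    _                         -> Nothing
  where
    body = trim rest
    -- Split off a trailing 'R', if there is one.
    (num, rev) = case reverse body of
      ('R' : ds) -> (reverse ds, True)
      _          -> (body, False)
parseHeader _ = Nothing

-- | Group the input into (header, read) pairs, joining multi-line reads.
-- Returns Nothing if any header is malformed or the input does not
-- start with a header line.
parseWeightedFasta :: String -> Maybe [(Header, String)]
parseWeightedFasta = go . filter (not . null) . map trim . lines
  where
    go [] = Just []
    go (h : rest) = do
      hdr <- parseHeader h
      let (seqLines, remaining) = break isHeaderLine rest
      others <- go remaining
      Just ((hdr, concat seqLines) : others)
    isHeaderLine l = take 1 l == ">"
```

Applied to the example above, the sketch yields three reads with weights 1, 0.3 and 0.2, none of them marked as reverse.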

iupac :: [[Char]] -> [Char]

Given a set of nucleotides, get the IUPAC ambiguity code.

unIupac :: Char -> [Char]

Convert an IUPAC ambiguity code to the set of nucleotides it represents.
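
For reference, the mapping between nucleotide sets and IUPAC ambiguity codes can be sketched as a lookup table. The table holds the standard IUPAC nucleotide codes; the names `iupacTable`, `basesToCode` and `codeToBases` are hypothetical, and how the library's own iupac and unIupac treat unknown input is not covered by this sketch.

```haskell
import Data.List (nub, sort)

-- | Standard IUPAC nucleotide codes and the bases each one stands for.
-- Every base list is kept in alphabetical order so it can serve as a key.
iupacTable :: [(Char, String)]
iupacTable =
  [ ('A', "A"), ('C', "C"), ('G', "G"), ('T', "T")
  , ('R', "AG"), ('Y', "CT"), ('S', "CG"), ('W', "AT")
  , ('K', "GT"), ('M', "AC")
  , ('B', "CGT"), ('D', "AGT"), ('H', "ACT"), ('V', "ACG")
  , ('N', "ACGT")
  ]

-- | Code to bases, e.g. 'W' to "AT". Nothing for an unknown code.
codeToBases :: Char -> Maybe String
codeToBases c = lookup c iupacTable

-- | Bases to code, ignoring order and duplicates, e.g. "TA" to 'W'.
-- Nothing for an empty set or for characters outside A, C, G and T.
basesToCode :: String -> Maybe Char
basesToCode bs = lookup (sort (nub bs)) [ (ns, c) | (c, ns) <- iupacTable ]
```

Applied per position, `mapM basesToCode ["A", "C", "AT", "G"]` evaluates to `Just "ACWG"`, which matches the AT mix at the third nucleotide in the module description.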

complementNucleotides :: Text -> Text

Return the complement of a nucleotide string.