RNAdesign: Multi-target RNA sequence design

[ bioinformatics, gpl, library, program ]

The RNA sequence design problem asks for a single sequence that readily folds into the (one or more) structural targets that are given as input.

This program expects on standard input a file with one or more structures and, optionally, additional sequence constraints in the form of an IUPAC string. It then runs a Markov chain to find a sequence that is optimal with regard to the structural targets and the user-definable optimization function.

The user can give different optimization criteria on the command line, akin to a simple calculator.

For more details please consult: https://github.com/choener/RNAdesign/blob/master/README.md

You can also run RNAdesign --showmanual which will display the same README.md.

If you find this program useful, please cite:

Christian Hoener zu Siederdissen, Stefan Hammer, Ingrid Abfalter, Ivo L. Hofacker, Christoph Flamm, Peter F. Stadler
Computational design of RNAs with complex energy landscapes
2013. Biopolymers 99, no. 12: 1124–36.

http://dx.doi.org/10.1002/bip.22337


Versions [RSS] 0.0.2.1, 0.1.0.0, 0.1.1.0, 0.1.2.1, 0.1.2.2
Change log changelog
Dependencies array (>=0.4), base (>=4 && <5), BiobaseTurner (>=0.3.1.1), BiobaseVienna (>=0.3), BiobaseXNA (>=0.8.1), bytestring (>=0.10), cmdargs (>=0.10 && <0.11), containers, fgl (>=5.4), fgl-extras-decompositions (>=0.1.0.0), file-embed (>=0.0.6), lens (>=3.9), monad-primitive (>=0.1), mwc-random-monad (>=0.6), parallel (>=3.2), parsec (>=3), ParsecTools (>=0.0.2 && <0.0.3), primitive (>=0.5), PrimitiveArray (>=0.5.3), random (>=1.0), RNAFold (>=1.99.3.3), transformers (>=0.3), tuple (>=0.2), vector (>=0.10), ViennaRNA-bindings (>=0.1.1.1) [details]
License GPL-3.0-only
Copyright Christian Hoener zu Siederdissen, 2013-2014
Author Christian Hoener zu Siederdissen
Maintainer choener@tbi.univie.ac.at
Category Bioinformatics
Uploaded by ChristianHoener at 2014-02-13T23:55:24Z
Reverse Dependencies 1 direct, 0 indirect [details]
Executables RNAdesign
Downloads 4443 total (8 in the last 30 days)
Status Docs available [build log]
Successful builds reported [all 1 reports]

Readme for RNAdesign-0.1.2.2

RNAdesign

The RNAdesign program solves the multi-target RNA sequence design problem. You can give one or more structural targets for which a single compatible sequence is designed.

PAPER

Christian Hoener zu Siederdissen, Stefan Hammer, Ingrid Abfalter, Ivo L. Hofacker, Christoph Flamm, Peter F. Stadler. Computational Design of RNAs with Complex Energy Landscapes. 2013. Biopolymers 99, no. 12: 1124–36. http://dx.doi.org/10.1002/bip.22337.

Contact

choener@tbi.univie.ac.at

HOW TO USE RNAdesign

RNAdesign designs RNA sequences given one or more structural targets. The program offers a variety of optimization functions, each of which can be used to optimize a candidate sequence toward a certain goal, say, minimal ensemble defect or small energetic distance to another target structure.

RNAdesign input

Structural targets are read from stdin, preferably from an input file. Below is the small tri-stable example from our paper, which you can pipe to RNAdesign: "cat tri-stable.dat | RNAdesign"

Contents of tri-stable.dat:

# a tri-stable example target. (optional comment)
((((....))))....((((....))))........
........((((....((((....))))....))))
((((((((....))))((((....))))....))))
# below follows a simple (and optional) sequence constraint.
CKNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNB
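The constraint line uses IUPAC nucleotide codes: C allows only cytosine, K allows G or U, N allows any nucleotide, and B allows anything except A. The following is a minimal Python sketch of how a candidate sequence could be checked against such a constraint. The function name `satisfies_constraint` is hypothetical and is not part of RNAdesign; only the code table itself is the standard IUPAC one, written with U instead of T.

```python
# Standard IUPAC nucleotide codes, written for RNA (U instead of T).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "AG", "Y": "CU", "S": "CG", "W": "AU",
    "K": "GU", "M": "AC",
    "B": "CGU", "D": "AGU", "H": "ACU", "V": "ACG",
    "N": "ACGU",
}


def satisfies_constraint(sequence, constraint):
    """Return True if each nucleotide is allowed by the IUPAC code at its position.

    Hypothetical helper for illustration; not RNAdesign's own implementation.
    """
    if len(sequence) != len(constraint):
        return False
    return all(base in IUPAC[code] for base, code in zip(sequence, constraint))


# A short, made-up constraint using the same codes as the example above.
print(satisfies_constraint("CGAU", "CKNB"))  # K allows G, B allows U
print(satisfies_constraint("CAAU", "CKNB"))  # K does not allow A
```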

The input may contain any number of comment lines, each starting with a hash "#", and at most one sequence constraint line. All of these lines are optional, except of course for the structural targets.
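To make the input rules concrete, here is a small Python sketch that reads such a file. It is an illustration only, not the parser RNAdesign uses. The function names are hypothetical, and two checks are assumptions of this sketch: that all lines must have the same length (a single sequence has to fit every target) and that parentheses must be balanced.

```python
def check_balanced(structure):
    """Raise ValueError if the dot-bracket string has unmatched parentheses."""
    depth = 0
    for ch in structure:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unmatched ')' in " + structure)
    if depth != 0:
        raise ValueError("unmatched '(' in " + structure)


def parse_design_input(text):
    """Split the input text into (list of structures, constraint or None)."""
    structures, constraint = [], None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if set(line) <= set("()."):
            structures.append(line)  # dot-bracket structural target
        elif constraint is None:
            constraint = line.upper()  # the single allowed constraint line
        else:
            raise ValueError("more than one sequence constraint line")
    if not structures:
        raise ValueError("at least one structural target is required")
    lengths = {len(s) for s in structures}
    if constraint is not None:
        lengths.add(len(constraint))
    if len(lengths) != 1:
        raise ValueError("all input lines must have the same length")
    for s in structures:
        check_balanced(s)
    return structures, constraint


example = "# two toy targets\n((..))\n......\n# any nucleotide everywhere\nNNNNNN\n"
print(parse_design_input(example))
```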

Optimization functions

Depending on the actual design you are looking for, you'll want to modify the optimization function. Below, the different options available are detailed. By giving a complex "--optfun", many different design goals can be tried.

A good optimization goal is (as an example for three targets):

--optfun "eos(1)+eos(2)+eos(3) - 3 * gibbs + 1 * ((eos(1)-eos(2))^2 + (eos(1)-eos(3))^2 + (eos(2)-eos(3))^2)"

With this function, the targets fold close to the mfe of the sequence (first part) and lie close together in energy (second part). The factor "1 *" weights the second part against the first and can be set by the user.
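The arithmetic of this example can be traced with a few lines of Python. This sketch only evaluates the formula for given numbers; it does not compute any energies. The function name `example_objective` and the energy values are made up for illustration.

```python
from itertools import combinations


def example_objective(eos, gibbs, weight=1.0):
    """Evaluate sum(eos) - n * gibbs + weight * (sum of squared pairwise differences).

    eos    : list of target energies, one per structural target
    gibbs  : Gibbs free energy of the candidate sequence
    weight : the user-chosen factor written as "1 *" in the example
    """
    n = len(eos)
    distance_to_ensemble = sum(eos) - n * gibbs
    pairwise_spread = sum((a - b) ** 2 for a, b in combinations(eos, 2))
    return distance_to_ensemble + weight * pairwise_spread


# Hypothetical energies for three targets and one candidate sequence.
print(example_objective([-10.0, -9.0, -11.0], gibbs=-12.0))  # 6.0 + 1 * 6.0 = 12.0
```

For three targets, `combinations` yields exactly the pairs (1,2), (1,3) and (2,3) that appear in the "--optfun" string above.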

binary, combining:

+ - * / :: the four basic operations
^ :: (^) generalized power function

binary, apply function to many targets:

sum max min :: run function over set of targets: sum(eos,1,2) or sum(eos,all)
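As a rough illustration of what sum(eos,1,2) and sum(eos,all) mean, the following Python sketch applies a per-target function to a selection of 1-based target indices and combines the results. The helper `aggregate` and the energy table are hypothetical; RNAdesign evaluates these expressions internally.

```python
def aggregate(combine, per_target, targets, n_targets):
    """Apply per_target to each selected target index and combine the results.

    combine    : sum, max or min
    per_target : function taking a 1-based target index
    targets    : list of 1-based indices, or the string "all"
    n_targets  : number of structural targets in the input
    """
    indices = range(1, n_targets + 1) if targets == "all" else targets
    return combine(per_target(i) for i in indices)


# Made-up energies for three targets, indexed from 1 as in eos(1), eos(2), eos(3).
energies = {1: -10.0, 2: -9.0, 3: -11.0}


def eos(i):
    return energies[i]


print(aggregate(sum, eos, [1, 2], 3))  # corresponds to sum(eos,1,2)
print(aggregate(sum, eos, "all", 3))   # corresponds to sum(eos,all)
```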

unary, apply to single target:

eos :: energy of a structure: eos(1)
ed :: ensemble defect of a structure: ed(3)
partc :: constrained partition function: partc(1)

You probably want to use partc in conjunction with eos, where eos is scaled by a small constant: "0.1 * eos(1) + partc(1)". eos guides the optimizer to the first viable sequence, after which the constrained partition function becomes active.

nullary, constant for the current sequence:

Ged :: global, weighted ensemble defect: Ged
gibbs :: gibbs free energy of sequence
mfe :: minimum free energy of sequence

special:

logMN :: requires four parameters: logMN(0.2,0.3,0.3,0.2)
penalizes according to the given mono-nucleotide distribution, in the order ACGU
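To see what a mono-nucleotide distribution in ACGU order looks like for a concrete sequence, the Python sketch below counts nucleotide frequencies and compares them with a target distribution. The exact penalty that logMN applies is not described here, so the deviation measure (sum of absolute differences) is purely an assumption for illustration, and both function names are hypothetical.

```python
from collections import Counter


def mononucleotide_frequencies(sequence):
    """Return the observed frequencies of A, C, G, U (in that order)."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    counts = Counter(sequence.upper())
    return tuple(counts[base] / len(sequence) for base in "ACGU")


def distribution_deviation(sequence, target):
    """Sum of absolute differences between observed and target frequencies.

    ASSUMPTION: this is one simple deviation measure chosen for illustration;
    the penalty actually used by logMN may be defined differently.
    """
    observed = mononucleotide_frequencies(sequence)
    return sum(abs(o - t) for o, t in zip(observed, target))


target = (0.2, 0.3, 0.3, 0.2)  # same order as logMN(0.2,0.3,0.3,0.2): A, C, G, U
print(mononucleotide_frequencies("AACCCGGGUU"))
print(distribution_deviation("AACCCGGGUU", target))
```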