RNAcalibrate - calibrate statistics of secondary structure hybridisations of
RNAs
RNAcalibrate [-h] [-d frequency_file] [-f from,to] [-k sample_size]
[-l mean,std] [-m max_target_length] [-n max_query_length]
[-u iloop_upper_limit] [-v bloop_upper_limit] [-s]
[-t target_file] [-q query_file] [target] [query]
RNAcalibrate is a tool for calibrating minimum free energy (mfe)
hybridisations performed with RNAhybrid. It searches a random database, which
can be given on the command line; otherwise it generates random sequences
according to the given sample size, length distribution parameters and
dinucleotide frequencies. An extreme value distribution (evd) is then fitted
to the empirical distribution of length-normalised minimum free energies. For
each miRNA, the output gives its name (or "command_line" if it was submitted
on the command line), the number of data points used for the evd fit, and the
location and scale parameters. The location and scale parameters of the evd
can then be given to RNAhybrid for the calculation of mfe p-values.
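
The fitting step described above can be illustrated with a minimal Python
sketch. It assumes that the evd is of Gumbel type, that the scores are oriented
so that larger values are more extreme (for example, negated normalised
energies), and that a simple method-of-moments estimate is acceptable.
RNAcalibrate's own fitting procedure is not specified in this page and may
differ; all function names below are hypothetical.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def fit_gumbel_moments(values):
    """Estimate (location, scale) of a Gumbel distribution by the method
    of moments. Assumption: not necessarily the fit RNAcalibrate performs."""
    mean = statistics.fmean(values)
    std = statistics.stdev(values)
    scale = std * math.sqrt(6) / math.pi
    location = mean - EULER_GAMMA * scale
    return location, scale


def gumbel_upper_tail(x, location, scale):
    """P(X >= x) for a Gumbel distribution with the given parameters."""
    return 1.0 - math.exp(-math.exp(-(x - location) / scale))


def sample_gumbel(n, location, scale, rng):
    """Draw n Gumbel variates by inverse transform sampling (for testing)."""
    out = []
    for _ in range(n):
        u = rng.uniform(1e-12, 1.0 - 1e-12)  # stay strictly inside (0, 1)
        out.append(location - scale * math.log(-math.log(u)))
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    scores = sample_gumbel(5000, 2.0, 0.5, rng)
    loc, scale = fit_gumbel_moments(scores)
    print("location:", loc, "scale:", scale)
    print("upper-tail probability at 4.0:", gumbel_upper_tail(4.0, loc, scale))
```

The method of moments is used here only because it needs no external library;
a maximum likelihood fit would be an equally valid choice for a sketch.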
- -h
- Give a short summary of command line options.
- -d frequency_file
- Generate random sequences according to dinucleotide
frequencies given in frequency_file. See example directory for
example files.
- -f from,to
- Force all structures to have a helix from query position "from" to
query position "to". The first base has position 1.
- -k sample_size
- Generate sample_size random sequences. Default value
is 5000.
- -l mean,std
- Generate random sequences whose lengths follow a normal distribution
with mean "mean" and standard deviation "std". Default values are
500 and 300, respectively.
- -m max_target_length
- The maximum allowed length of a target sequence. The
default value is 2000. This option only has an effect if a target file is
given with the -t option (see below).
- -n max_query_length
- The maximum allowed length of a query sequence. The default
value is 30. This option only has an effect if a query file is given with
the -q option (see below).
- -u iloop_upper_limit
- The maximum allowed number of unpaired nucleotides on
either side of an internal loop.
- -v bloop_upper_limit
- The maximum allowed number of unpaired nucleotides in a
bulge loop.
- -s
- Generate random sequences according to the dinucleotide
distribution of the given targets, supplied either with the -t option or on
the command line. If no -t is given, the target is the last argument (if -q
is given) or the second-to-last argument (if no -q is given) to
RNAcalibrate. See the -t option.
- -t target_file
- Without the -s option, each of the target sequences in
target_file is hybridised with each of the queries (which come either
from the query_file or from the single query given on the command line;
see -q below). The sequences in target_file must be in FASTA format,
i.e. one line starting with a > directly followed by a name, then one or
more lines with the sequence itself. No individual sequence line may have
more than 1000 characters.
With the -s option, the target (or target file) dinucleotide distribution is
counted, and random sequences are generated according to this
distribution.
If no -t is given, random sequences are generated as described above (see -d
option).
- -q query_file
- See -t option above. If no -q is given, the last argument
to RNAcalibrate is taken as a query.
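
The random database generation controlled by the -k, -l, -d and -s options can
be sketched in Python as follows. This is an illustration under stated
assumptions, not RNAcalibrate's implementation: sequences are drawn from a
first-order Markov chain built from dinucleotide counts, lengths are rounded
draws from a normal distribution, and lengths below one are simply redrawn.
The function names and the handling of short lengths are hypothetical.

```python
import random
from collections import defaultdict


def count_dinucleotides(sequences):
    """Count overlapping dinucleotides in a collection of sequences."""
    counts = defaultdict(int)
    for seq in sequences:
        seq = seq.upper()
        for a, b in zip(seq, seq[1:]):
            counts[a + b] += 1
    return dict(counts)


def random_sequence(length, dinuc_counts, rng):
    """Draw one sequence from a first-order Markov chain whose transitions
    are weighted by the given dinucleotide counts."""
    if length <= 0:
        return ""
    pairs = list(dinuc_counts)
    weights = [dinuc_counts[p] for p in pairs]
    # Start with a dinucleotide drawn by its overall frequency.
    seq = list(rng.choices(pairs, weights=weights)[0])
    while len(seq) < length:
        last = seq[-1]
        followers = [p for p in pairs if p[0] == last]
        if followers:
            fweights = [dinuc_counts[p] for p in followers]
            seq.append(rng.choices(followers, weights=fweights)[0][1])
        else:
            # Assumption: if no dinucleotide starts with `last`, fall back
            # to the first base of a freshly drawn dinucleotide.
            seq.append(rng.choices(pairs, weights=weights)[0][0])
    return "".join(seq[:length])


def random_database(sample_size, mean, std, dinuc_counts, rng):
    """Generate sample_size sequences with normally distributed lengths."""
    database = []
    while len(database) < sample_size:
        length = round(rng.gauss(mean, std))
        if length < 1:
            continue  # assumption: redraw lengths that are not positive
        database.append(random_sequence(length, dinuc_counts, rng))
    return database


if __name__ == "__main__":
    rng = random.Random(42)
    counts = count_dinucleotides(["ACGUACGGUCA", "GGCAUUACG"])
    db = random_database(5, 500, 300, counts, rng)
    print([len(s) for s in db])
```

Counting dinucleotides from real targets corresponds to the -s behaviour
described above, while passing a prepared table of counts plays the role of a
frequency file given with -d.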
The energy parameters are taken from:
Mathews DH, Sabina J, Zuker M, Turner DH. "Expanded sequence dependence of
thermodynamic parameters improves prediction of RNA secondary structure"
J Mol Biol., 288 (5), pp 911-940, 1999
This man page documents version 2.0 of RNAcalibrate.
Marc Rehmsmeier, Peter Steffen, Matthias Hoechsmann.
Character-dependent energy values are only defined for [acgtuACGTU]. All other
characters lead to energy values of zero in these cases.
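
Input files can be checked against the two constraints stated in this page
(sequence lines of at most 1000 characters, and energy values defined only for
[acgtuACGTU]) before running RNAcalibrate. The following Python sketch shows
one way to do so. The function name check_fasta is hypothetical, and the
sketch assumes that the 1000-character limit counts sequence characters
without the line terminator.

```python
import io

ALLOWED = set("acgtuACGTU")  # characters with defined energy values
MAX_LINE = 1000              # maximum length of one sequence line


def check_fasta(handle):
    """Return a list of (line_number, message) problems in a FASTA stream."""
    problems = []
    seen_header = False
    for number, raw in enumerate(handle, start=1):
        line = raw.rstrip("\r\n")
        if not line:
            continue
        if line.startswith(">"):
            seen_header = True
            # The name must follow the '>' directly.
            if len(line) == 1 or line[1].isspace():
                problems.append((number, "header has no name directly after '>'"))
            continue
        if not seen_header:
            problems.append((number, "sequence data before first header"))
        if len(line) > MAX_LINE:
            problems.append((number, "sequence line longer than 1000 characters"))
        bad = sorted(set(line) - ALLOWED)
        if bad:
            problems.append(
                (number, "characters without defined energy values: " + "".join(bad))
            )
    return problems


if __name__ == "__main__":
    example = io.StringIO(">target1\nACGUNNACGU\n")
    for number, message in check_fasta(example):
        print("line", number, ":", message)
```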
RNAhybrid, RNAeffective