ClonalFrameML - Efficient Inference of Recombination in Whole Bacterial Genomes
ClonalFrameML newick_file fasta_file output_file
[OPTIONS]
ClonalFrameML is a software package that performs efficient inference of
recombination in bacterial genomes. ClonalFrameML was created by Xavier
Didelot and Daniel Wilson. ClonalFrameML can be applied to any type of aligned
sequence data, but is especially aimed at analysis of whole genome sequences.
It is able to compare hundreds of whole genomes in a matter of hours on a
standard desktop computer. There are three main outputs from a run of
ClonalFrameML: a phylogeny with branch lengths corrected to account for
recombination, an estimation of the key parameters of the recombination
process, and a genomic map of where recombination took place for each branch
of the phylogeny.
ClonalFrameML is a maximum likelihood implementation of the Bayesian software
ClonalFrame, which was previously described by Didelot and Falush (2007). The
recombination model underpinning ClonalFrameML is exactly the same as for
ClonalFrame, but this new implementation is much faster, is able to deal with
much larger genomic datasets, and does not suffer from MCMC convergence issues.
- -em
- true (default) or false Estimate parameters by a Baum-Welch
expectation maximization algorithm.
- -embranch
- true or false (default) Estimate parameters for each branch
using the EM algorithm.
- -rescale_no_recombination
- true or false (default) Rescale branch lengths for given
sites with no recombination model.
- -imputation_only
- true or false (default) Perform only ancestral state
reconstruction and imputation.
- -kappa
- value > 0 (default 2.0) Relative rate of transitions vs.
transversions in the substitution model.
- -fasta_file_list
- true or false (default) Take fasta_file to be a white-space
separated file list.
- -xmfa_file
- true or false (default) Take fasta_file to be an XMFA
file.
- -ignore_user_sites
- sites_file Ignore sites listed in whitespace-separated
sites_file.
- -ignore_incomplete_sites
- true or false (default) Ignore sites with any ambiguous
bases.
- -use_incompatible_sites
- true (default) or false Use homoplasious and multiallelic
sites to correct branch lengths.
- -show_progress
- true or false (default) Output the progress of the maximum
likelihood routines.
- -chromosome_name
- name, e.g. "chr" Output importation status file in
BED format using given chromosome name.
- -min_branch_length
- value > 0 (default 1e-7) Minimum branch length.
- -reconstruct_invariant_sites
- true or false (default) Reconstruct the ancestral states at
invariant sites.
- -label_uncorrected_tree
- true or false (default) Regurgitate the uncorrected Newick
tree with internal nodes labelled.
- -prior_mean
- default "0.1 0.001 0.1 0.0001" Prior mean for R/theta,
1/delta, nu and M.
- -prior_sd
- default "0.1 0.001 0.1 0.0001" Prior standard
deviation for R/theta, 1/delta, nu and M.
- -initial_values
- default "0.1 0.001 0.05" Initial values for
R/theta, 1/delta and nu.
- -guess_initial_m
- true (default) or false Initialize M and nu jointly in the
EM algorithms.
- -emsim
- value >= 0 (default 0) Number of simulations to estimate
uncertainty in the EM results.
- -embranch_dispersion
- value > 0 (default 0.01) Dispersion in parameters among
branches in the -embranch model.
- -brent_tolerance
- tolerance (default 0.001) Set the tolerance of the Brent
routine for -rescale_no_recombination.
- -powell_tolerance
- tolerance (default 0.001) Set the tolerance of the Powell
routine for -rescale_no_recombination.
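As an illustration of how the synopsis and the options above fit together, the
following minimal Python sketch assembles a command line without running it.
It assumes that each option is followed by its value as a separate argument
and that multi-value options such as -prior_mean are passed as one quoted
string, as the quoted defaults above suggest. The helper build_command and the
file names tree.newick, alignment.fasta and run1 are hypothetical and are not
part of ClonalFrameML.

```python
import shlex


def build_command(newick_file, fasta_file, output_file, options=None):
    """Return the argument list for a ClonalFrameML run (nothing is executed).

    `options` maps option names without the leading dash to values.
    Booleans become "true"/"false"; lists and tuples are joined with
    spaces so that they form a single argument (an assumption based on
    the quoted defaults shown for -prior_mean and -prior_sd).
    """
    argv = ["ClonalFrameML", newick_file, fasta_file, output_file]
    for name, value in (options or {}).items():
        if isinstance(value, bool):
            value = "true" if value else "false"
        elif isinstance(value, (list, tuple)):
            value = " ".join(str(v) for v in value)
        argv += ["-" + name, str(value)]
    return argv


# Hypothetical input and output names, chosen only for illustration.
cmd = build_command(
    "tree.newick",
    "alignment.fasta",
    "run1",
    {
        "kappa": 2.5,            # must be > 0 according to the option list
        "emsim": 100,            # number of simulations for EM uncertainty
        "show_progress": True,
        "prior_mean": (0.1, 0.001, 0.1, 0.0001),
    },
)

# Print a shell-quoted form that could be pasted into a terminal.
print(shlex.join(cmd))
```

The list returned by build_command could be handed to subprocess.run on a
system where ClonalFrameML is installed. Building a list rather than a single
string keeps the four prior values together as one argument.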
This manpage was written by Andreas Tille for the Debian distribution and may be
used for any other purpose related to the program.