NAME

abpoa, abpoa.avx2, abpoa.avx, abpoa.sse4.1, abpoa.ssse3, abpoa.sse3, abpoa.generic - adaptive banded Partial Order Alignment

SYNOPSIS

abpoa [options] <in.fa/fq> > cons.fa/msa.out/abpoa.gfa

DESCRIPTION

abPOA is an extended version of Partial Order Alignment (POA) that performs adaptive banded dynamic programming (DP) with an SIMD implementation. abPOA can perform multiple sequence alignment (MSA) on a set of input sequences and generate a consensus sequence by applying the heaviest bundling algorithm to the final alignment graph.
abPOA can generate high-quality consensus sequences from error-prone long reads and offers a significant speed improvement over existing tools.
abPOA supports three alignment modes (global, local, extension) and flexible scoring schemes that allow linear, affine and convex gap penalties. It currently supports SSE2/SSE4.1/AVX2 vectorization.

OPTIONS

Alignment:
-m --aln-mode
INT alignment mode [0] 0: global, 1: local, 2: extension
-M --match
INT match score [2]
-X --mismatch
INT mismatch penalty [4]
-t --matrix
FILE scoring matrix file, '-M' and '-X' are not used when '-t' is used [Null]
e.g., 'HOXD70.mtx, BLOSUM62.mtx'
-O --gap-open
INT(,INT) gap opening penalty (O1,O2) [4,24]
-E --gap-ext
INT(,INT) gap extension penalty (E1,E2) [2,1]
abPOA provides three gap penalty modes, cost of a g-long gap:
- convex (default): min{O1+g*E1, O2+g*E2}
- affine (set O2 as 0): O1+g*E1
- linear (set O1 as 0): g*E1
-s --amb-strand
ambiguous strand mode [False]
for each input sequence, try the reverse complement if the current alignment score is too low, and pick the strand with the higher score
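
The three gap penalty modes listed under -E can be illustrated with a small Python sketch. It is not abPOA's implementation: the function name gap_cost is hypothetical, and treating O1 = 0 and O2 = 0 as mode switches is an assumption based on the "set O1 as 0" / "set O2 as 0" wording above.

```python
def gap_cost(g, o1=4, e1=2, o2=24, e2=1):
    """Cost of a gap of length g, using the -O/-E defaults [4,24] and [2,1].

    Assumption: o1 == 0 selects the linear mode and o2 == 0 selects the
    affine mode; any other setting uses the convex cost.
    """
    if g <= 0:
        return 0                      # no gap, no cost
    if o1 == 0:
        return g * e1                 # linear: g*E1
    if o2 == 0:
        return o1 + g * e1            # affine: O1+g*E1
    return min(o1 + g * e1, o2 + g * e2)  # convex: min{O1+g*E1, O2+g*E2}


if __name__ == "__main__":
    for g in (1, 10, 20, 30):
        print(g, gap_cost(g))
```

With the default values the two convex terms are equal at g = 20 (4+2*20 = 24+1*20 = 44), so short gaps are charged by (O1,E1) and gaps longer than 20 by (O2,E2).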
Adaptive banded DP:
-b --extra-b
INT first adaptive banding parameter [10]
set b < 0 to disable adaptive banded DP
-f --extra-f
FLOAT second adaptive banding parameter [0.01]
the number of extra bases added on both sides of the band is b+f*L, where L is the length of the aligned sequence
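
As a worked illustration of the b+f*L formula above, the following Python sketch computes the band extension for a few sequence lengths. The function name extra_band_bases is hypothetical, and how abPOA rounds the result is not stated here, so the value is left as a float.

```python
def extra_band_bases(seq_len, b=10, f=0.01):
    """Extra bases added on each side of the band: b + f*L.

    Returns None when b < 0, which the -b description says disables
    adaptive banded DP. Rounding is left to the caller (assumption:
    abPOA's own rounding is not documented in this page).
    """
    if b < 0:
        return None
    return b + f * seq_len


if __name__ == "__main__":
    for length in (100, 1000, 10000):
        print(length, extra_band_bases(length))
```

With the defaults, the constant term b dominates for short sequences, while the f*L term dominates once L exceeds b/f = 1000.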
Minimizer-based seeding and partition (only effective in global alignment mode):
-S --seeding
enable minimizer-based seeding and anchoring [False]
-k --k-mer
INT minimizer k-mer size [19]
-w --window
INT minimizer window size [10]
-n --min-poa-win
INT min. size of window to perform POA [500]
-p --progressive
build guide tree and perform progressive partial order alignment [False]
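
To show what the -k and -w parameters control, here is a minimal Python sketch of generic (w,k)-minimizer selection: in every window of w consecutive k-mers, the smallest k-mer is kept. This is an illustration under assumptions, not abPOA's code: the function name minimizers is hypothetical, and plain lexicographic order is used where a real tool would typically use a hash and may also consider the reverse strand.

```python
def minimizers(seq, k=19, w=10):
    """Return [(position, kmer), ...] for a generic (w,k)-minimizer scheme.

    Assumption: k-mers are compared lexicographically and ties go to the
    leftmost k-mer; abPOA's actual ordering is not described in this page.
    """
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    picked = []
    for start in range(len(kmers) - w + 1):
        # Index of the leftmost smallest k-mer in this window.
        pos = min(range(start, start + w), key=lambda j: kmers[j])
        # Consecutive windows often share a minimizer; record it once.
        if not picked or picked[-1][0] != pos:
            picked.append((pos, kmers[pos]))
    return picked


if __name__ == "__main__":
    # Small k and w so the toy sequence produces several windows.
    print(minimizers("ACGTACGT", k=3, w=2))
```

Larger w samples fewer k-mers per sequence, and larger k makes each sampled k-mer more specific, which is the usual trade-off when tuning these two values.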
Input/Output:
-Q --use-qual-weight
take base quality score from FASTQ input file as graph edge weight [False]
-c --amino-acid
input sequences are amino acid (default is nucleotide) [False]
-l --in-list
input file is a list of sequence file names [False]
each line is one sequence file containing a set of sequences which will be aligned by abPOA to generate a consensus sequence
-i --incrmnt
FILE incrementally align sequences to an existing graph/MSA [Null]
graph could be in GFA or MSA format generated by abPOA
-o --output
FILE output to FILE [stdout]
-r --result
INT output result mode [0]
- 0: consensus in FASTA format
- 1: MSA in PIR format
- 2: both 0 & 1
- 3: graph in GFA format
- 4: graph with consensus path in GFA format
- 5: consensus in FASTQ format
-d --maxnum-cons
INT max. number of consensus sequences to generate [1]
-q --min-freq
FLOAT min. frequency of each consensus sequence (only effective when -d/--maxnum-cons > 1) [0.25]
-g --out-pog
FILE dump final alignment graph to FILE (.pdf/.png) [Null]
-h --help
print this help usage information
-v --version
show version number

SEE ALSO

For more information please refer to the paper published in Bioinformatics: https://dx.doi.org/10.1093/bioinformatics/btaa963