abpoa.sse4.1

NAME

abpoa, abpoa.avx2, abpoa.avx, abpoa.sse4.1, abpoa.ssse3, abpoa.sse3, abpoa.generic - adaptive banded Partial Order Alignment

SYNOPSIS

abpoa [options] <in.fa/fq> > cons.fa/msa.out/abpoa.gfa

DESCRIPTION

abPOA is an extended version of Partial Order Alignment (POA) that performs adaptive banded dynamic programming (DP) with a SIMD implementation. abPOA can perform multiple sequence alignment (MSA) on a set of input sequences and generate a consensus sequence by applying the heaviest bundling algorithm to the final alignment graph.

abPOA can generate high-quality consensus sequences from error-prone long reads and offers a significant speed improvement over existing tools.

abPOA supports three alignment modes (global, local, extension) and flexible scoring schemes that allow linear, affine and convex gap penalties. It currently supports SSE2/SSE4.1/AVX2 vectorization.

OPTIONS

Alignment:

-m --aln-mode: INT alignment mode [0] (0: global, 1: local, 2: extension)

-M --match: INT match score [2]

-X --mismatch: INT mismatch penalty [4]

-t --matrix: FILE scoring matrix file [Null]; '-M' and '-X' are ignored when '-t' is used, e.g., 'HOXD70.mtx', 'BLOSUM62.mtx'

-O --gap-open: INT(,INT) gap opening penalty (O1,O2) [4,24]

-E --gap-ext: INT(,INT) gap extension penalty (E1,E2) [2,1]
    abPOA provides three gap penalty modes; the cost of a g-long gap is:
    - convex (default): min{O1+g*E1, O2+g*E2}
    - affine (set O2 as 0): O1+g*E1
    - linear (set O1 as 0): g*E1

-s --amb-strand: ambiguous strand mode [False]; for each input sequence, try the reverse complement if the current alignment score is too low, and pick the strand with the higher score
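The three gap penalty modes listed under -E can be sketched in a few lines of Python. This is only an illustration of the formulas above, not abPOA's implementation; the function name gap_cost and the rule that a zero O1 selects linear mode before a zero O2 selects affine mode are assumptions made for this example.

```python
def gap_cost(g, o1=4, e1=2, o2=24, e2=1):
    """Cost of a gap of length g, following the -O (o1,o2) and -E (e1,e2) formulas.

    Mode selection mirrors the help text (an assumption about precedence):
      o1 == 0 -> linear, o2 == 0 -> affine, otherwise convex.
    """
    if g <= 0:
        return 0                      # no gap, no penalty
    if o1 == 0:
        return g * e1                 # linear: g*E1
    if o2 == 0:
        return o1 + g * e1            # affine: O1+g*E1
    # convex: the cheaper of the two affine pieces
    return min(o1 + g * e1, o2 + g * e2)


if __name__ == "__main__":
    for g in (1, 10, 20, 30):
        print(g, gap_cost(g))
```

With the default penalties, the first piece (O1+g*E1) is cheaper for gaps shorter than 20 bases and the second piece (O2+g*E2) is cheaper for longer gaps, so long gaps are penalized less steeply.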

Adaptive banded DP:

-b --extra-b: INT first adaptive banding parameter [10]; set b < 0 to disable adaptive banded DP

-f --extra-f: FLOAT second adaptive banding parameter [0.01]; the number of extra bases added on both sides of the band is b+f*L, where L is the length of the aligned sequence
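As a worked example of the b+f*L formula, the following Python sketch computes the number of extra bases for a given sequence length. The function name extra_band_bases is hypothetical, and whether abPOA rounds or truncates the result internally is not stated here, so the value is returned as a float.

```python
def extra_band_bases(seq_len, b=10, f=0.01):
    """Extra bases added on each side of the band: b + f*L.

    Returns None when b < 0, which the help text describes as
    disabling adaptive banded DP. Rounding is left to the caller
    (an assumption; abPOA's own rounding is not documented here).
    """
    if b < 0:
        return None
    return b + f * seq_len


if __name__ == "__main__":
    for length in (500, 1000, 10000):
        print(length, extra_band_bases(length))
```

With the defaults, a 1000-base sequence gets about 20 extra bases on each side of the band, and a 10000-base sequence gets about 110.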

Minimizer-based seeding and partition (only effective in global alignment mode):

-S --seeding: enable minimizer-based seeding and anchoring [False]

-k --k-mer: INT minimizer k-mer size [19]

-w --window: INT minimizer window size [10]

-n --min-poa-win: INT min. size of window to perform POA [500]

-p --progressive: build guide tree and perform progressive partial order alignment [False]
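To illustrate what the -k and -w parameters control, here is a minimal Python sketch of (w,k)-minimizer selection: in every window of w consecutive k-mers, the smallest k-mer is kept as a seed. The function name minimizers is hypothetical, and ordering k-mers lexicographically on a single strand is a simplification assumed for this example; abPOA's own k-mer ordering and strand handling are not described here.

```python
def minimizers(seq, k=19, w=10):
    """Return sorted (position, k-mer) pairs chosen as window minimizers.

    Simplified sketch: k-mers are compared lexicographically and ties
    are broken by the leftmost position (both are assumptions).
    """
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    picked = set()
    # Slide a window over every run of w consecutive k-mers.
    for start in range(len(kmers) - w + 1):
        window = range(start, start + w)
        best = min(window, key=lambda i: (kmers[i], i))
        picked.add((best, kmers[best]))
    return sorted(picked)


if __name__ == "__main__":
    # Small k and w so the result is easy to follow by hand.
    for pos, kmer in minimizers("ACGTTGCA", k=3, w=2):
        print(pos, kmer)
```

Larger values of w keep fewer seeds, and larger values of k make each seed more specific.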

Input/Output:

-Q --use-qual-weight: take base quality score from FASTQ input file as graph edge weight [False]

-c --amino-acid: input sequences are amino acid (default is nucleotide) [False]

-l --in-list: input file is a list of sequence file names [False]; each line names one sequence file containing a set of sequences, which abPOA aligns to generate a consensus sequence

-i --incrmnt: FILE incrementally align sequences to an existing graph/MSA [Null]; the graph can be in GFA or MSA format generated by abPOA

-o --output: FILE output to FILE [stdout]

-r --result: INT output result mode [0]
    - 0: consensus in FASTA format
    - 1: MSA in PIR format
    - 2: both 0 & 1
    - 3: graph in GFA format
    - 4: graph with consensus path in GFA format
    - 5: consensus in FASTQ format

-d --maxnum-cons: INT max. number of consensus sequences to generate [1]

-q --min-freq: FLOAT min. frequency of each consensus sequence (only effective when -d/--maxnum-cons > 1) [0.25]

-g --out-pog: FILE dump final alignment graph to FILE (.pdf/.png) [Null]

-h --help: print this help usage information

-v --version: show version number
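The options above can be combined into a command line following the SYNOPSIS (options first, input file last). The Python sketch below builds such a command using only the -m, -r, -S and -o flags documented here. The helper names abpoa_command and run_abpoa and the file names reads.fa and msa.out are hypothetical, and actually running the command assumes an abpoa executable is on the PATH.

```python
import subprocess


def abpoa_command(in_file, result_mode=0, aln_mode=0, out_file=None, seeding=False):
    """Build an argv list for abpoa from a few documented options."""
    if result_mode not in range(6):
        raise ValueError("result mode must be 0..5 (see -r)")
    if aln_mode not in range(3):
        raise ValueError("alignment mode must be 0..2 (see -m)")
    cmd = ["abpoa", "-m", str(aln_mode), "-r", str(result_mode)]
    if seeding:
        cmd.append("-S")              # minimizer-based seeding
    if out_file is not None:
        cmd += ["-o", out_file]       # otherwise abpoa writes to stdout
    cmd.append(in_file)
    return cmd


def run_abpoa(in_file, **options):
    """Run abpoa and return its stdout (requires abpoa to be installed)."""
    result = subprocess.run(abpoa_command(in_file, **options),
                            check=True, capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    # Print the command for an MSA in PIR format written to msa.out.
    print(" ".join(abpoa_command("reads.fa", result_mode=1, out_file="msa.out")))
```

Passing the arguments as a list avoids shell quoting issues with file names that contain spaces.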
