bamconsensus - compute rough consensus sequence from alignments
bamconsensus reference=ref.fasta [options] < in.bam > out.fasta
bamconsensus reads a BAM, SAM or CRAM file and computes a rough consensus from
the alignments it contains. The input file must be sorted in coordinate
order. The consensus is written in FastA format to the standard output
channel. The sequence names in the output file have the form
contig_A_B_C_D_E
where
A is the numeric reference id (0 based)
B is the name of the reference sequence as given in the BAM header
C is a numerical contig id within the contigs for a given reference id
D is the start position on the reference sequence (inclusive)
E is the end position on the reference sequence (exclusive)
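The naming scheme above can be taken apart programmatically. The following is a minimal sketch, not part of bamconsensus: the function name parse_contig_name, the ContigName type and the sample name are hypothetical. It assumes that the reference name B may itself contain underscores, so the two leading fields are split from the left and the three trailing numeric fields from the right.

```python
from typing import NamedTuple


class ContigName(NamedTuple):
    """Fields of a contig_A_B_C_D_E sequence name (hypothetical helper type)."""
    ref_id: int      # A: numeric reference id (0 based)
    ref_name: str    # B: reference sequence name from the BAM header
    contig_id: int   # C: contig id within the given reference
    start: int       # D: start position on the reference (inclusive)
    end: int         # E: end position on the reference (exclusive)

    @property
    def span(self) -> int:
        # The interval is half-open, so its length is simply end - start.
        return self.end - self.start


def parse_contig_name(name: str) -> ContigName:
    """Split a name of the form contig_A_B_C_D_E into its five fields."""
    # Peel off "contig" and A from the left; everything else stays in rest.
    prefix, ref_id, rest = name.split("_", 2)
    if prefix != "contig":
        raise ValueError(f"not a bamconsensus sequence name: {name!r}")
    # Peel off C, D and E from the right, so that underscores inside the
    # reference name B are preserved.
    ref_name, contig_id, start, end = rest.rsplit("_", 3)
    return ContigName(int(ref_id), ref_name, int(contig_id), int(start), int(end))
```

For the made-up name contig_0_chr1_random_3_1000_1250 this yields the reference name chr1_random and a covered reference span of 250 bases.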
The reference key, which names a FastA reference sequence file, is
required. The consensus is constructed by computing heavy paths in local
De Bruijn graphs. Consequently, for diploid or polyploid genomes it is
usually a patchwork of the haplotypes present.
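To illustrate why a heavy path mixes haplotypes, here is a toy sketch of the general idea. It is not the algorithm implemented in bamconsensus: it assumes a simple greedy walk that starts at the most frequent k-mer and always follows the most frequent unvisited neighbour, and the names kmer_counts and heavy_path are hypothetical. At every branch the locally heavier k-mer wins, regardless of which haplotype it came from.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every k-mer occurring in the given read sequences."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def heavy_path(reads, k):
    """Greedy heavy walk through the k-mer graph of the reads (toy version)."""
    counts = kmer_counts(reads, k)
    if not counts:
        return ""
    # Seed: the most frequent k-mer, ties broken lexicographically.
    seed = min(counts, key=lambda kmer: (-counts[kmer], kmer))
    visited = {seed}
    path = seed

    # Extend to the right: follow the heaviest unvisited successor.
    node = seed
    while True:
        successors = [node[1:] + base for base in "ACGT"]
        successors = [s for s in successors if s in counts and s not in visited]
        if not successors:
            break
        node = max(successors, key=lambda s: counts[s])
        visited.add(node)
        path += node[-1]

    # Extend to the left: follow the heaviest unvisited predecessor.
    node = seed
    while True:
        predecessors = [base + node[:-1] for base in "ACGT"]
        predecessors = [p for p in predecessors if p in counts and p not in visited]
        if not predecessors:
            break
        node = max(predecessors, key=lambda p: counts[p])
        visited.add(node)
        path = node[0] + path
    return path
```

With three copies of the read ACGTTGCA, one copy of the variant ACGTAGCA and k=4 (a small value chosen only for readability), the walk follows the better supported branch and returns ACGTTGCA.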
The following key=value pairs can be given:
reference=<ref.fasta>: reference FastA file (required)
verbose=<1>: Valid values are
- 1: print progress report on standard error
- 0: do not print progress report
T=<filename>: set the prefix for temporary file names
k=<32>: k-mer size used for consensus computation (maximum 32).
minlen=<50>: minimum length of alignments used (default 50).
inputformat=<bam>: input format
range=<>: input range to be processed. This option is only valid if
the input is a coordinate-sorted and indexed BAM file.
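A call using these key=value pairs can be scripted. The sketch below is one way to do so from Python, assuming bamconsensus is installed and on the PATH. The helper names build_command and run_bamconsensus and the file names are hypothetical; only keys listed above are passed.

```python
import subprocess


def build_command(reference, **options):
    """Assemble the argument list: the required reference key, then any options."""
    args = ["bamconsensus", f"reference={reference}"]
    # Each option is passed in the documented key=value form.
    args += [f"{key}={value}" for key, value in options.items()]
    return args


def run_bamconsensus(in_path, out_fasta, reference, **options):
    """Feed in_path to standard input and write standard output to out_fasta."""
    with open(in_path, "rb") as src, open(out_fasta, "wb") as dst:
        subprocess.run(
            build_command(reference, **options),
            stdin=src,
            stdout=dst,
            check=True,  # raise CalledProcessError on a non-zero exit status
        )
```

For example, run_bamconsensus("in.bam", "out.fasta", "ref.fasta", k=24, minlen=100, verbose=0) corresponds to the shell command bamconsensus reference=ref.fasta k=24 minlen=100 verbose=0 < in.bam > out.fasta.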
Written by German Tischler-Höhle.
Report bugs to <[email protected]>
Copyright © 2019 German Tischler. License GPLv3+: GNU GPL version 3
<http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it. There is NO
WARRANTY, to the extent permitted by law.