bamauxmerge2 - merge information in unmapped and mapped BAM files
bamauxmerge2 [options] in_unmapped in_mapped
bamauxmerge2 reads two BAM files and merges them into a single alignment
file. The input files are expected to have the following properties:
- the first file contains only unmapped reads and its header contains no
  SQ lines
- the second file was produced by an aligner based on the content of the
  first file
- both files are sorted in query name order
The headers of the two files are merged in the following way:
- the SQ lines contained in the header of the second file are appended to
  the header of the first file to obtain the header of the output file
- all other header information from the second file is discarded
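The header rule above can be sketched on plain SAM header text. This is only an illustration of the described behaviour, not the tool's code: the function name merge_headers and the example header lines are made up for this sketch.

```python
def merge_headers(unmapped_header: str, mapped_header: str) -> str:
    """Keep the first header as is and append only the @SQ lines of the second."""
    first = [line for line in unmapped_header.splitlines() if line]
    sq_lines = [line for line in mapped_header.splitlines()
                if line.startswith("@SQ\t")]
    return "\n".join(first + sq_lines) + "\n"


# Hypothetical example headers (tab separated, as in SAM text).
unmapped = "@HD\tVN:1.6\tSO:queryname\n@RG\tID:rg1\tSM:sample1\n"
mapped = ("@HD\tVN:1.6\tSO:unsorted\n"
          "@SQ\tSN:chr1\tLN:1000\n"
          "@SQ\tSN:chr2\tLN:500\n"
          "@PG\tID:aligner\tPN:aligner\n")

merged = merge_headers(unmapped, mapped)
print(merged, end="")
```

In this sketch the @HD and @PG lines of the second header are dropped, which matches the rule that all non-SQ header information from the second file is discarded.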
The output records are constructed in the following way:
1. Take a record from the second file.
2. Copy all aux fields from the corresponding record in the first file
   which are not already present.
3. Reinsert the clipped adapter bases and quality values stored in the
   qs/qq aux fields by fastqtobam2, remove the qs/qq aux fields, and insert
   the appropriate soft clipping CIGAR operations.
4. Fix mate information in the same way as bamfixmateinformation.
5. Insert the mate CIGAR information fields MC and MS if the mate is
   aligned.
6. Insert the MQ (mate mapping quality) aux field.
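Steps 2, 5 and 6 can be sketched with aux fields held in plain dictionaries. This is a simplified model under stated assumptions: the record layout (keys "unmapped", "cigar", "mapq") and both function names are hypothetical, and the MS field is left out because its computation is not described here.

```python
def copy_missing_aux(mapped_aux: dict, unmapped_aux: dict) -> dict:
    """Step 2: add aux fields from the unmapped record unless already present."""
    merged = dict(mapped_aux)
    for tag, value in unmapped_aux.items():
        merged.setdefault(tag, value)  # existing aligner values win
    return merged


def add_mate_fields(aux: dict, mate: dict) -> dict:
    """Steps 5 and 6: add MC if the mate is aligned, then add MQ."""
    out = dict(aux)
    if not mate["unmapped"]:
        out["MC"] = mate["cigar"]
    out["MQ"] = mate["mapq"]
    return out


# Hypothetical records: aux fields of one read in both files, plus its mate.
aux_mapped = {"NM": 1, "RG": "from_aligner"}
aux_unmapped = {"RG": "rg1", "BC": "ACGT"}
mate = {"unmapped": False, "cigar": "100M", "mapq": 60}

aux = add_mate_fields(copy_missing_aux(aux_mapped, aux_unmapped), mate)
print(aux)
```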
The following key=value pairs can be given:
zz=<0|1>: replace read name by content of nn aux field. Valid values are
- 1: replace read name
- 0: do not replace read name
calmdnm=<0|1>: recompute MD and NM aux fields. Valid values are
- 1: recompute MD and NM aux fields. This requires the calmdnmreference
  key to be set to the name of an appropriate FastA file.
- 0: do not recompute MD and NM aux fields
calmdnmreference=<>: reference FastA file, required if calmdnm=1 or
replacecigar=1.
replacecigar=<0|1>: replace M cigar operations by the appropriate = and X
operations. Valid values are
- 1: replace cigar operations. This requires the calmdnmreference key to
  be set on invocation.
- 0: do not replace cigar operations
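The effect of replacecigar=1 can be illustrated with a small sketch that splits each M operation into = (match) and X (mismatch) runs by comparing read and reference bases. The function name, the (operation, length) representation and the example data are assumptions made for illustration. This is not the code used by bamauxmerge2.

```python
def expand_m_ops(read: str, ref: str, cigar: list) -> list:
    """Replace M by runs of = and X; ref starts at the alignment position."""
    out = []
    i = j = 0  # positions in read and reference

    def push(op, n):
        # Merge with the previous operation if it is of the same type.
        if out and out[-1][0] == op:
            out[-1] = (op, out[-1][1] + n)
        else:
            out.append((op, n))

    for op, length in cigar:
        if op == "M":
            for _ in range(length):
                same = read[i].upper() == ref[j].upper()
                push("=" if same else "X", 1)
                i += 1
                j += 1
        else:
            push(op, length)
            if op in ("I", "S"):
                i += length
            elif op in ("D", "N"):
                j += length
            elif op in ("=", "X"):
                i += length
                j += length
    return out


# Hypothetical alignment with a mismatch at the third base.
reference = "ACGTACGTAC"
read = "ACCTTACAC"
cigar = [("M", 4), ("I", 1), ("M", 2), ("D", 2), ("M", 2)]

new_cigar = expand_m_ops(read, reference, cigar)
print("".join(f"{n}{op}" for op, n in new_cigar))
```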
hash=<crc32prod>: hash used for computing sequence checksums. See
bamseqchksum for further information.
filehash=<md5>: hash used for computing output file checksums.
chksumfn=<>: file name used for storing sequence checksum
information. By default this information is not saved.
filehashfn=<>: file name used for storing file checksum
information. By default this information is not saved.
level=<-1|0|1|9>: set compression level of the output BAM file. Valid
values are
- -1: zlib/gzip default compression level
- 0: uncompressed
- 1: zlib/gzip level 1 (fast) compression
- 9: zlib/gzip level 9 (best) compression
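The meaning of these level values can be tried out with Python's zlib module, which accepts the same numbers. The sketch below only illustrates the size trade-off on an artificial payload. It does not reproduce the block format of BAM output, and the payload is an assumption chosen for illustration.

```python
import zlib

# Artificial, highly repetitive payload; real BAM data compresses differently.
payload = b"ACGT" * 4096

# Compressed size for each level accepted by the level key.
sizes = {level: len(zlib.compress(payload, level)) for level in (-1, 0, 1, 9)}

for level, size in sizes.items():
    print(f"level={level}: {size} bytes (input {len(payload)} bytes)")
```

Level 0 stores the data without compressing it, so its output is slightly larger than the input because of the container overhead.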
verbose=<1>: Valid values are
- 1: print progress report on standard error
- 0: do not print progress report
tmpfile=<filename>: prefix for temporary files. By default the temporary
files are created in the current directory.
md5=<0|1>: md5 checksum creation for output file. Valid values are
- 0: do not compute checksum. This is the default.
- 1: compute checksum. If the md5filename key is set, then the checksum is
  written to the given file. If md5filename is unset, then no checksum
  will be computed.
md5filename=<filename>: file name for md5 checksum if md5=1.
threads=<[1]>: number of threads used for processing. By default 1 thread
is used. Set to 0 to use as many threads as there are detected CPU cores.
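The key=value syntax and the argument order from the synopsis can be put together as in the following sketch, which only builds the argument list. The helper build_command and all file names are hypothetical. The commented call at the end additionally assumes that the merged BAM file is written to standard output, which is not stated in this page.

```python
def build_command(in_unmapped: str, in_mapped: str, **options) -> list:
    """Build the argument list: options as key=value, then the two inputs."""
    argv = ["bamauxmerge2"]
    argv += [f"{key}={value}" for key, value in options.items()]
    argv += [in_unmapped, in_mapped]
    return argv


cmd = build_command(
    "unmapped.bam", "mapped.bam",   # hypothetical input files
    level=1, threads=0,
    calmdnm=1, calmdnmreference="ref.fa",
)
print(" ".join(cmd))

# Running it (assumption: merged output is written to standard output):
# import subprocess
# with open("merged.bam", "wb") as out:
#     subprocess.run(cmd, stdout=out, check=True)
```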
Written by German Tischler.
Report bugs to <[email protected]>
Copyright © 2009-2019 German Tischler, © 2011-2013 Genome Research
Limited. License GPLv3+: GNU GPL version 3
<http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it. There is NO
WARRANTY, to the extent permitted by law.