samtools-merge - merges multiple sorted files into a single file
samtools merge [options] -o out.bam [options] in1.bam ... inN.bam
samtools merge [options] out.bam in1.bam ... inN.bam
Merge multiple sorted alignment files, producing a single sorted output file
that contains all the input records and maintains the existing sort order.
The output file can be specified via -o as shown in the first synopsis.
Otherwise the first non-option filename argument is taken to be out.bam
rather than an input file, as in the second synopsis. There is no default; to
write to standard output (or to a pipe), use either “-o -” or the equivalent
using “-” as the first filename argument.
If -h is specified, the @SQ headers of the input files will be merged into the
specified header; otherwise they will be merged into a composite header
created from the input headers. If, while merging @SQ lines for
coordinate-sorted input files, a conflict arises as to the order (for example
input1.bam has @SQ for a,b,c and input2.bam has b,a,c), then the resulting
output file will need to be re-sorted back into coordinate order.
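One way to spot such an ordering conflict before merging is to compare the @SQ names of the inputs. The following is a minimal sketch, not part of samtools: the helper sq_names and the two header files are hypothetical, and the headers are written by hand here to mirror the a,b,c versus b,a,c case.

```shell
# Print the reference names from the @SQ lines of a SAM header file, in order.
sq_names() {
    grep '^@SQ' "$1" | tr '\t' '\n' | grep '^SN:' | cut -c4-
}

# Two small hypothetical headers mirroring the a,b,c versus b,a,c case.
printf '@SQ\tSN:a\tLN:100\n@SQ\tSN:b\tLN:200\n@SQ\tSN:c\tLN:300\n' > input1.header.sam
printf '@SQ\tSN:b\tLN:200\n@SQ\tSN:a\tLN:100\n@SQ\tSN:c\tLN:300\n' > input2.header.sam

# Compare the ordered name lists of the two headers.
if [ "$(sq_names input1.header.sam)" = "$(sq_names input2.header.sam)" ]; then
    sq_order=same
else
    sq_order=differs
fi
echo "@SQ order: $sq_order"
```

For real inputs, the header text can be obtained with "samtools view -H in.bam" and saved to a file before running the comparison.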
Unless the -c or -p flags are specified, when @RG and @PG records are merged
into the output header, any IDs that duplicate existing IDs in the output
header will have a suffix appended to differentiate them from similar header
records from other files, and the read records will be updated to reflect
this.
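To see in advance which @RG IDs would collide, the IDs of the input headers can be compared. This is a rough sketch under stated assumptions: the helper rg_ids and the header files are hypothetical, and each ID is assumed to appear at most once per file.

```shell
# Print the ID values from the @RG lines of a SAM header file.
rg_ids() {
    grep '^@RG' "$1" | tr '\t' '\n' | grep '^ID:' | cut -c4-
}

# Two small hypothetical headers that share the ID "lane1".
printf '@RG\tID:lane1\tSM:hs\n@RG\tID:lane2\tSM:hs\n' > a.header.sam
printf '@RG\tID:lane1\tSM:hs\n@RG\tID:lane3\tSM:hs\n' > b.header.sam

# IDs listed by both files are the ones that would receive a suffix
# unless -c is given.
shared_ids=$( { rg_ids a.header.sam; rg_ids b.header.sam; } | sort | uniq -d )
echo "shared @RG IDs: $shared_ids"
```

The same approach works for @PG lines by changing the pattern in the first grep.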
The ordering of the records in the input files must match the usage of the
-n and -t command-line options. If they do not, the output order
will be undefined. See samtools-sort(1) for information about record ordering.
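A quick sanity check is to read the sort order recorded in each input header. The sketch below assumes the inputs carry an @HD line with an SO: field (coordinate or queryname, as defined by the SAM format); the helper merge_flag_for and the header files are hypothetical, and tag-sorted inputs (-t) are not detected by it.

```shell
# Map the @HD SO: value of a SAM header file to the matching merge flag.
# printf is used instead of echo because "echo -n" prints nothing in many shells.
merge_flag_for() {
    so=$(grep '^@HD' "$1" | tr '\t' '\n' | grep '^SO:' | cut -c4-)
    case "$so" in
        coordinate) printf '%s\n' "(none)" ;;
        queryname)  printf '%s\n' "-n" ;;
        *)          printf '%s\n' "check manually" ;;
    esac
}

# Two hypothetical headers, one name-sorted and one coordinate-sorted.
printf '@HD\tVN:1.6\tSO:queryname\n' > byname.header.sam
printf '@HD\tVN:1.6\tSO:coordinate\n' > bycoord.header.sam

echo "byname:  $(merge_flag_for byname.header.sam)"
echo "bycoord: $(merge_flag_for bycoord.header.sam)"
```

If the inputs disagree, re-sorting them consistently with samtools sort before merging avoids an undefined output order.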
- -1
- Use Deflate compression level 1 to compress the output.
- -b FILE
- List of input BAM files, one file per line.
- -f
- Overwrite the output file if it already exists.
- -h FILE
- Use the lines of FILE as “@” headers to be copied to
out.bam, replacing any header lines that would otherwise be copied
from in1.bam. (FILE is actually in SAM format, though any
alignment records it may contain are ignored.)
- -n
- The input alignments are sorted by read names rather than
by chromosomal coordinates.
- -o FILE
- Write merged output to FILE, specifying the filename
via an option rather than as the first filename argument. When -o
is used, all non-option filename arguments specify input files to be
merged.
- -t TAG
- The input alignments have been sorted by the value of TAG,
then by either position or name (if -n is given).
- -R STR
- Merge files in the region specified by STR [null].
- -r
- Attach an RG tag to each alignment. The tag value is
inferred from file names.
- -u
- Write uncompressed BAM output.
- -c
- When several input files contain @RG headers with the same
ID, emit only one of them (namely, the header line from the first file we
find that ID in) to the merged output file. Combining these similar
headers is usually the right thing to do when the files being merged
originated from the same file.
Without -c, all @RG headers appear in the output file, with random
suffixes added to their IDs where necessary to differentiate them.
- -p
- Similarly, for each @PG ID in the set of files to merge,
use the @PG line of the first file we find that ID in rather than adding a
suffix to differentiate similar IDs.
- -X
- Allow the user to specify customized index file location(s)
when the data folder does not contain any index file. See the EXAMPLES
section for sample usage.
- -L FILE
- BED file specifying multiple regions on which the merge
will be performed. This option extends the -R option and
cannot be used together with it.
- --no-PG
- Do not add a @PG line to the header of the output
file.
- -@, --threads INT
- Number of input/output compression threads to use in
addition to the main thread [0].
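As an illustration of how several of these options combine, the sketch below writes an input list for -b and then merges it with -f, -@ and -o. The file names are hypothetical, and the merge step is skipped unless samtools and all listed inputs are actually present.

```shell
# Hypothetical input names; replace with real, consistently sorted BAM files.
printf '%s\n' in1.bam in2.bam in3.bam > bam_list.txt

# Check that every listed file exists before calling samtools.
all_present=yes
while IFS= read -r bam; do
    [ -e "$bam" ] || all_present=no
done < bam_list.txt

if [ "$all_present" = yes ] && command -v samtools >/dev/null 2>&1; then
    # -f overwrites merged.bam if present; -@ 2 adds two compression threads.
    samtools merge -f -@ 2 -b bam_list.txt -o merged.bam
else
    echo "skipping merge: inputs or samtools not available"
fi
```

Because -o names the output, no positional filename is taken as out.bam, so all inputs can come from the list file.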
- Attach the RG tag while merging sorted alignments:
perl -e 'print "\@RG\tID:ga\tSM:hs\tLB:ga\tPL:Illumina\n\@RG\tID:454\tSM:hs\tLB:454\tPL:454\n"' > rg.txt
samtools merge -rh rg.txt merged.bam ga.bam 454.bam
The value of the RG tag is determined by the name of the file the read
comes from. In this example, in merged.bam, reads from
ga.bam will be tagged RG:Z:ga, while reads from
454.bam will be tagged RG:Z:454.
- Include customized index files as part of the arguments:
samtools merge [options] -X <out.bam> </data_folder/in1.bam> [</data_folder/in2.bam> ... </data_folder/inN.bam>] </index_folder/index1.bai> [</index_folder/index2.bai> ... </index_folder/indexN.bai>]
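The two examples can be sketched with small helpers. Both are hypothetical and rest on assumptions: rg_id_for assumes the inferred RG value is the file's base name without the .bam suffix, as the ga.bam and 454.bam example suggests, and build_merge_x_cmd assumes each index is stored as <index_dir>/<input name>.bai, a naming scheme invented here for illustration. The second helper only prints the command line, listing all inputs first and then the indexes in the same order.

```shell
# Guess the RG value that -r would attach for a given input file
# (assumption: base name with the .bam suffix removed).
rg_id_for() {
    basename "$1" .bam
}

# Print a merge command that lists the inputs followed by their index files.
# Usage: build_merge_x_cmd OUT INDEX_DIR IN1 [IN2 ...]
# Paths containing spaces are not handled by this simple sketch.
build_merge_x_cmd() {
    out=$1
    index_dir=$2
    shift 2
    inputs=""
    indexes=""
    for bam in "$@"; do
        inputs="$inputs $bam"
        indexes="$indexes $index_dir/$(basename "$bam").bai"
    done
    echo "samtools merge -X $out$inputs$indexes"
}

rg_id_for /data_folder/ga.bam
build_merge_x_cmd merged.bam /index_folder /data_folder/in1.bam /data_folder/in2.bam
```

Printing the command rather than running it makes it easy to check the argument order before pointing it at real data.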
Written by Heng Li from the Sanger Institute.
samtools(1), samtools-sort(1), sam(5)
Samtools website: <http://www.htslib.org/>