samtools-ampliconclip - clip reads using a BED file
samtools ampliconclip [-o out.file] [-f stat.file] [--soft-clip] [--hard-clip]
[--both-ends] [--strand] [--clipped] [--fail] [--filter-len INT]
[--fail-len INT] [--no-excluded] [--rejects-file rejects.file] [--original]
[--keep-tag] [--tolerance INT] [--no-PG] [-u] -b bed.file in.file
Clips the ends of read alignments if they intersect with regions defined in a
BED file. While this tool was originally written for clipping read alignment
positions which correspond to amplicon primer locations it can also be used in
other contexts.
BED file entries used are chrom, chromStart, chromEnd and, optionally, strand.
There is a default tolerance of 5 bases when matching chromStart and chromEnd
to alignments.
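As an illustration of the input described above, the following sketch writes a
small BED file. It assumes the standard six-column BED layout, in which strand
is the sixth field; the contig name, coordinates and primer names are invented
for the example and do not come from this manual.

```shell
# Write a two-entry BED file of hypothetical primer regions.
# Columns are tab separated: chrom, chromStart, chromEnd, name, score, strand.
# "ref1", the coordinates and the primer names are made-up placeholders.
printf 'ref1\t30\t54\tamp1_LEFT\t0\t+\n'    >  primers.bed
printf 'ref1\t385\t410\tamp1_RIGHT\t0\t-\n' >> primers.bed

# Sanity check: every line should have six tab-separated fields.
awk -F'\t' 'NF != 6 { bad = 1 } END { exit bad }' primers.bed \
    && echo "primers.bed has six fields on every line"
```

Only chrom, chromStart and chromEnd are required; the name and score fields
are fillers so that strand lands in its usual position.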
By default, reads are soft clipped and clipping is done only from the 5' end.
Some things to be aware of. While ordering is not significant, adjustments to
the leftmost mapping position (POS) mean that coordinate-sorted files will
need re-sorting. In such cases the sort order in the header is set to unknown.
Clipping of reads results in the template length (TLEN) being incorrect. This
can be corrected by samtools fixmate. Any MD and NM aux tags will also be
incorrect, which can be fixed by samtools calmd. By default, MD and NM tags
are removed, though if the output is in CRAM format these tags will be
automatically regenerated.
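The clean-up steps mentioned above could be chained roughly as follows. This
is a sketch rather than a recommended pipeline: the function name
clip_and_repair and all file names are placeholders, and it assumes
paired-end data, that samtools fixmate is given name-sorted input, and that a
reference FASTA is available for samtools calmd.

```shell
# Clip primers, then repair the fields that clipping invalidates.
# Arguments: input alignments, BED file, reference FASTA, output BAM.
clip_and_repair() {
    in_bam=$1; bed=$2; ref=$3; out_bam=$4
    tmp=$(mktemp -d) || return 1

    # 1. Clip; statistics go to stderr by default.
    samtools ampliconclip -b "$bed" -o "$tmp/clipped.bam" "$in_bam" &&
    # 2. Name-sort so that fixmate sees mates together, then fix TLEN.
    samtools sort -n -o "$tmp/namesorted.bam" "$tmp/clipped.bam" &&
    samtools fixmate "$tmp/namesorted.bam" "$tmp/fixmate.bam" &&
    # 3. Restore coordinate order, since POS may have changed.
    samtools sort -o "$tmp/sorted.bam" "$tmp/fixmate.bam" &&
    # 4. Recalculate the MD and NM tags against the reference.
    samtools calmd -b "$tmp/sorted.bam" "$ref" > "$out_bam"
    status=$?

    rm -rf "$tmp"
    return $status
}

# Run only when samtools and the placeholder inputs are actually present.
if command -v samtools >/dev/null 2>&1 &&
   [ -f in.bam ] && [ -f primers.bed ] && [ -f ref.fa ]; then
    clip_and_repair in.bam primers.bed ref.fa clipped.repaired.bam
fi
```

Intermediate files are used instead of pipes to keep each step easy to
inspect; piping the commands together is a reasonable alternative.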
- -b FILE
- BED file of regions (e.g. amplicon primers) to be
removed.
- -o FILE
- Output file name (defaults to stdout).
- -f FILE
- File to write stats to (defaults to stderr).
- -u
- Output uncompressed SAM, BAM or CRAM.
- --soft-clip
- Soft clip reads (default).
- --hard-clip
- Hard clip reads.
- --both-ends
- Clip at both the 5' and the 3' ends where regions
match.
- --strand
- Use strand entry from the BED file to clip on the matching
forward or reverse alignment.
- --clipped
- Only output clipped reads. Filter all others.
- --fail
- Mark unclipped reads as QC fail.
- --filter-len INT
- Filter out reads of INT size or shorter. In this case soft
clips are not counted toward read length. An INT of 0 will filter out
reads with no matching bases.
- --fail-len INT
- As --filter-len but mark as QC fail rather than
filter out.
- --no-excluded
- Filter out any reads that are marked as QCFAIL or are
unmapped. This works on the state of the reads before clipping takes
place.
- --rejects-file FILE
- Write any filtered reads out to a file.
- --original
- Add an OA tag with the original data for clipped
reads.
- --keep-tag
- In clipped reads, keep the possibly invalid NM and MD tags.
By default these tags are deleted.
- --tolerance INT
- The amount of latitude given in matching regions to
alignments. Default 5 bases.
- --no-PG
- Do not add a PG line to the header.
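Two ways the options above might be combined are sketched below. The function
names and every file name are placeholders chosen for the example, and the
length threshold of 30 is arbitrary.

```shell
# Hard clip at both ends, use the BED strand entry when matching regions to
# forward or reverse alignments, and record the original data in an OA tag.
# Arguments: BED file, input alignments, output file.
clip_hard_stranded() {
    samtools ampliconclip --hard-clip --both-ends --strand --original \
        -b "$1" -o "$3" "$2"
}

# Soft clip (the default), filter out reads of 30 bases or shorter, write the
# filtered reads to a rejects file and the statistics to a stats file.
# Arguments: BED file, input alignments, output file, rejects file, stats file.
clip_and_filter() {
    samtools ampliconclip --filter-len 30 \
        --rejects-file "$4" -f "$5" \
        -b "$1" -o "$3" "$2"
}

# Run only when samtools and the placeholder inputs are actually present.
if command -v samtools >/dev/null 2>&1 &&
   [ -f in.bam ] && [ -f primers.bed ]; then
    clip_hard_stranded primers.bed in.bam hard.bam
    clip_and_filter primers.bed in.bam filtered.bam rejects.bam clip.stats
fi
```

Replacing --filter-len with --fail-len in the second function would mark the
short reads as QC fail instead of removing them.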
Written by Andrew Whitwham and Rob Davies, both from the Sanger Institute.
samtools(1),
samtools-sort(1),
samtools-fixmate(1),
samtools-calmd(1)
Samtools website: <http://www.htslib.org/>