art_SOLiD - Simulation of Applied Biosystems SOLiD Sequencing
ART is a set of simulation tools that generate synthetic next-generation
sequencing reads. ART simulates sequencing reads by mimicking the real sequencing
process with empirical error models or quality profiles summarized from large
recalibrated sequencing data.
art_SOLiD simulates Applied Biosystems SOLiD sequencing.
art_SOLiD [ options ] <INPUT_SEQ_FILE> <OUTPUT_FILE_PREFIX> <LEN_READ> <FOLD_COVERAGE>
art_SOLiD [ options ] <INPUT_SEQ_FILE> <OUTPUT_FILE_PREFIX> <LEN_READ> <FOLD_COVERAGE> <MEAN_FRAG_LEN> <STD_DEV>
art_SOLiD [ options ] <INPUT_SEQ_FILE> <OUTPUT_FILE_PREFIX> <LEN_READ_F3> <LEN_READ_F5> <FOLD_COVERAGE> <MEAN_FRAG_LEN> <STD_DEV>
art_SOLiD [ options ] -A s <INPUT_SEQ_FILE> <OUTPUT_FILE_PREFIX> <LEN_READ> <READS_PER_AMPLICON>
art_SOLiD [ options ] -A m <INPUT_SEQ_FILE> <OUTPUT_FILE_PREFIX> <LEN_READ> <READ_PAIRS_PER_AMPLICON>
art_SOLiD [ options ] -A p <INPUT_SEQ_FILE> <OUTPUT_FILE_PREFIX> <LEN_READ_F3> <LEN_READ_F5> <READ_PAIRS_PER_AMPLICON>
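The non-amplicon forms above differ only in how many positional arguments follow the options: four for single-end, six for matepair, and seven for paired-end. The sketch below illustrates that reading of the synopsis. `solid_mode` is a hypothetical helper, not part of ART, and it does not call art_SOLiD.

```shell
#!/bin/sh
# Hypothetical helper (not part of ART): name the simulation mode implied
# by the number of positional arguments in the non-amplicon synopsis forms.
solid_mode() {
    case $# in
        4) echo "single-end" ;;   # INPUT PREFIX LEN_READ FOLD_COVERAGE
        6) echo "matepair" ;;     # as above, plus MEAN_FRAG_LEN STD_DEV
        7) echo "paired-end" ;;   # LEN_READ_F3 LEN_READ_F5 replace LEN_READ
        *) echo "unknown"; return 1 ;;
    esac
}

# Positional arguments taken from the synopsis forms; nothing is executed.
solid_mode seq_reference.fa ./outdir/single_dat 25 10
solid_mode seq_reference.fa ./outdir/matepair_dat 35 20 2000 50
solid_mode seq_reference.fa ./outdir/paired_dat 75 35 50 250 10
```

Amplicon runs are not covered here because their read type is given explicitly with -A.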
- INPUT_SEQ_FILE - filename of DNA/RNA reference sequences in
FASTA format
- OUTPUT_FILE_PREFIX - prefix or directory for all output
read data files
- FOLD_COVERAGE - fold of read coverage over the reference
sequences
- LEN_READ - length of F3/R3 reads
- LEN_READ_F3 - length of F3 reads for paired-end read
simulation
- LEN_READ_F5 - length of F5 reads for paired-end read
simulation
- READS_PER_AMPLICON - number of reads per amplicon
- READ_PAIRS_PER_AMPLICON - number of read pairs per
amplicon
- MEAN_FRAG_LEN - mean DNA/RNA fragment size for
matepair/paired-end read simulation
- STD_DEV - standard deviation of the DNA/RNA fragment sizes
for matepair/paired-end read simulation
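As a rough guide to choosing FOLD_COVERAGE, fold coverage is conventionally the total number of sequenced bases divided by the reference length, so a single-end run yields about FOLD_COVERAGE x reference length / LEN_READ reads. The sketch below assumes that convention; the number of reads art_SOLiD actually emits may differ. The helper names `fasta_length` and `estimate_reads` and the demo sequence are hypothetical and not part of ART.

```shell
#!/bin/sh
# Hypothetical helpers (not part of ART).

# Print the total number of bases in a FASTA file; header lines are skipped.
fasta_length() {
    awk '!/^>/ { gsub(/[ \t\r]/, ""); n += length($0) } END { print n + 0 }' "$1"
}

# Print the approximate read count (integer division):
#   reads = FOLD_COVERAGE * reference_length / LEN_READ
# Usage: estimate_reads REFERENCE_LENGTH LEN_READ FOLD_COVERAGE
estimate_reads() {
    echo $(( $3 * $1 / $2 ))
}

# Example with a tiny made-up reference written to a temporary file.
tmp_fa=$(mktemp)
printf '>chr_demo\nACGTACGTAC\nGGTTAACC\n' > "$tmp_fa"
demo_len=$(fasta_length "$tmp_fa")
echo "reference length: $demo_len"
echo "estimated reads (25 bp, 10X): $(estimate_reads "$demo_len" 25 10)"
rm -f "$tmp_fa"
```

For matepair and paired-end runs the same estimate would count bases from both reads of a pair; whether art_SOLiD counts coverage that way is not stated here.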
- -A specify the read type for amplicon sequencing simulation (s: single-end, m: matepair, p: paired-end)
- -M use CIGAR 'M' instead of '=/X' for alignment match/mismatch
- -s generate a SAM alignment file
- -r specify the random seed for the simulation
- -f specify the scale factor for adjusting the error rate (e.g., -f 0 for zero-error-rate simulation)
- -p specify the user's own read profile for simulation
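To illustrate how -r and -f combine, the following sketch prints (without running) one single-end art_SOLiD command per error scale factor, all sharing one random seed so that the runs would differ only in error rate. The helper name `print_error_sweep`, the output prefix naming scheme, and the chosen factors are assumptions for illustration; only the -s, -r and -f options come from the list above.

```shell
#!/bin/sh
# Hypothetical dry-run helper (not part of ART): print one single-end
# art_SOLiD command per error scale factor, using a fixed random seed.
# Usage: print_error_sweep REF PREFIX LEN_READ FOLD_COVERAGE SEED FACTOR...
print_error_sweep() {
    ref=$1; prefix=$2; len=$3; cov=$4; seed=$5
    shift 5
    for factor in "$@"; do
        # The "_f<factor>" suffix on the output prefix is an arbitrary choice.
        echo "art_SOLiD -s -r $seed -f $factor $ref ${prefix}_f${factor} $len $cov"
    done
}

# Print commands for error scale factors 0, 1 and 2 with seed 777.
print_error_sweep seq_reference.fa ./outdir/sweep 25 10 777 0 1 2
```

Printing the commands first makes it easy to review them before piping the output to `sh` to run the simulations.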
- 1) single-end 25bp reads simulation at 10X coverage
- art_SOLiD -s seq_reference.fa ./outdir/single_dat 25 10
- 2) single-end 75bp reads simulation at 20X coverage with the user's own error profile
- art_SOLiD -s -p ../SOLiD_profiles/profile_pseudo ./seq_reference.fa ./dat_userProfile 75 20
- 3) matepair 35bp (F3-R3) reads simulation at 20X coverage with DNA/RNA MEAN fragment size 2000bp and STD 50
- art_SOLiD -s seq_reference.fa ./outdir/matepair_dat 35 20 2000 50
- 4) matepair reads simulation with a fixed random seed
- art_SOLiD -r 777 -s seq_reference.fa ./outdir/matepair_fs 50 10 1500 50
- 5) paired-end reads (75bp F3, 35bp F5) simulation with the MEAN fragment size 250 and STD 10 at 50X coverage
- art_SOLiD -s seq_reference.fa ./outdir/paired_dat 75 35 50 250 10
- 6) amplicon sequencing with 25bp single-end reads at 100 reads per amplicon
- art_SOLiD -A s -s amp_reference.fa ./outdir/amp_single 25 100
- 7) amplicon sequencing with 50bp matepair reads at 80 read pairs per amplicon
- art_SOLiD -A m -s amp_reference.fa ./outdir/amp_matepair 50 80
- 8) amplicon sequencing with paired-end reads (35bp F3, 25bp F5 reads) at 50 pairs per amplicon
- art_SOLiD -A p -s amp_reference.fa ./outdir/amp_pair 35 25 50
This manpage was written by Andreas Tille for the Debian distribution and can be
used for any other usage of the program.