NAME
art_454 - Simulation of 454 Pyrosequencing
DESCRIPTION
ART is a set of simulation tools that generate synthetic next-generation sequencing reads. ART simulates reads by mimicking the real sequencing process with empirical error models or quality profiles summarized from large recalibrated sequencing data. art_454 simulates 454 pyrosequencing reads.
USAGE
SINGLE-END SIMULATION
art_454 [-s] [-a] [-t] [-r rand_seed] [-p read_profile] [-c num_flow_cycles] <INPUT_SEQ_FILE> <OUTPUT_FILE_PREFIX> <FOLD_COVERAGE>
PAIRED-END SIMULATION
art_454 [-s] [-a] [-t] [-r rand_seed] [-p read_profile] [-c num_flow_cycles] <INPUT_SEQ_FILE> <OUTPUT_FILE_PREFIX> <FOLD_COVERAGE> <MEAN_FRAG_LEN> <STD_DEV>
AMPLICON SEQUENCING SIMULATION
art_454 [-s] [-a] [-t] [-r rand_seed] [-p read_profile] [-c num_flow_cycles] <-A|-B> <INPUT_SEQ_FILE> <OUTPUT_FILE_PREFIX> <#_READS/#_READ_PAIRS_PER_AMPLICON>
OPTIONS
MANDATORY OPTIONS
- INPUT_SEQ_FILE - the filename of DNA/RNA reference sequences in FASTA format
- OUTPUT_FILE_PREFIX - the prefix or directory of output read data file (*.fq) and read alignment file (*.aln)
- FOLD_COVERAGE - the fold of read coverage over the reference sequences
- MEAN_FRAG_LEN - the average DNA fragment size for paired-end read simulation
- STD_DEV - the standard deviation of the DNA fragment size for paired-end read simulation
- #READS_PER_AMPLICON - number of reads per amplicon (for 5'-end amplicon sequencing)
- #READ_PAIRS_PER_AMPLICON - number of read pairs per amplicon (for two-end amplicon sequencing)
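The usage forms above differ only in the -A/-B flag and in how many positional arguments follow the options. The rule can be sketched in shell as follows. The function name art454_mode is hypothetical and not part of ART; it mirrors the synopsis only and makes no claim about how art_454 parses its arguments internally.

```shell
#!/bin/sh
# art454_mode: hypothetical helper, NOT part of ART.
# Usage: art454_mode <amplicon_flag_or_empty_string> <number_of_positional_args>
art454_mode() {
    flag=$1
    nargs=$2
    case "${flag}:${nargs}" in
        -A:3) echo "single-end amplicon" ;;  # -A ref prefix reads_per_amplicon
        -B:3) echo "paired-end amplicon" ;;  # -B ref prefix read_pairs_per_amplicon
        :3)   echo "single-end" ;;           # ref prefix fold_coverage
        :5)   echo "paired-end" ;;           # ref prefix fold_coverage mean_frag_len std_dev
        *)    echo "invalid"; return 1 ;;
    esac
}

art454_mode "" 3    # prints: single-end
art454_mode "" 5    # prints: paired-end
art454_mode -A 3    # prints: single-end amplicon
```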
OPTIONAL PARAMETERS
- -A perform single-end amplicon sequencing simulation
- -B perform paired-end amplicon sequencing simulation
- -M use CIGAR 'M' instead of '=/X' for alignment match/mismatch
- -a output the ALN alignment file
- -s output the SAM alignment file
- -d print warning messages for debugging
- -t simulate reads with the built-in GS FLX Titanium profile [default: GS FLX profile]
- -r specify a fixed random seed for the simulation (so that two different runs generate identical datasets)
- -c specify the number of flow cycles of the sequencer [default: 100 for GS FLX, 200 for GS FLX Titanium]
- -p specify a user-supplied read profile for the simulation
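The interaction between -t and -c can be summarised with a small shell sketch. The helper flow_cycles is hypothetical and not part of ART; it only restates the documented defaults (100 for GS FLX, 200 for GS FLX Titanium) and assumes that an explicit -c value takes precedence over them.

```shell
#!/bin/sh
# flow_cycles: hypothetical helper, NOT part of ART.
# $1 = platform: "flx" (default profile) or "titanium" (as selected by -t)
# $2 = optional explicit value, as would be passed to -c
flow_cycles() {
    if [ -n "${2:-}" ]; then
        echo "$2"      # assumption: an explicit -c value overrides the default
    elif [ "$1" = "titanium" ]; then
        echo 200       # documented default for GS FLX Titanium
    else
        echo 100       # documented default for GS FLX
    fi
}

flow_cycles flx            # prints: 100
flow_cycles titanium       # prints: 200
flow_cycles titanium 150   # prints: 150
```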
EXAMPLES
- 1) single-end simulation with 20X coverage
- art_454 -s seq_reference.fa ./outdir/single_dat 20
- 2) paired-end simulation with a mean fragment size of 1500 and a standard deviation of 20, using the GS FLX Titanium platform
- art_454 -s -t seq_reference.fa ./outdir/paired_dat 10 1500 20
- 3) paired-end simulation with a fixed random seed
- art_454 -s -r 777 seq_reference.fa ./outdir/paired_fxSeed 10 2500 50
- 4) single-end amplicon sequencing with 10 reads per amplicon
- art_454 -A -s amplicon_ref.fa ./outdir/amp_single 10
- 5) paired-end amplicon sequencing with 10 read pairs per amplicon
- art_454 -B -s amplicon_ref.fa ./outdir/amp_paired 10
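When several runs are needed, the command lines can be generated in a loop. The following dry-run sketch only prints the commands and does not invoke art_454. The file names, seed value, coverage list, and the helper name build_cmd are illustrative assumptions; the flags used (-s, -r) are the ones documented above.

```shell
#!/bin/sh
# Dry-run sketch: print one single-end art_454 command line per coverage level.
# All names and values below are illustrative assumptions.
ref=seq_reference.fa
outdir=./outdir
seed=777

build_cmd() {
    # $1 = fold coverage; SAM output (-s) and a fixed random seed (-r)
    echo "art_454 -s -r $seed $ref $outdir/single_${1}x $1"
}

for cov in 5 10 20; do
    build_cmd "$cov"   # pipe the output to sh to actually run the commands
done
```

Because the seed is fixed with -r, rerunning the same generated command should reproduce the same dataset, as described for the -r option.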
AUTHOR
This manpage was written by Andreas Tille for the Debian distribution and can be used for any other usage of the program.
February 2016 | art_454 3.19.15