seer - sequence element enrichment analysis
Sequence Element Enrichment Analysis
The .pheno file format is tab separated with three columns: two with the sample
name and one with the phenotype. If every phenotype value is 0 or 1 the
phenotype is treated as binary; any other value causes it to be treated as
quantitative. A placeholder for a missing value would therefore change how the
phenotype is interpreted, so samples with missing phenotype values should
simply be excluded from this file.
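As a minimal sketch of the layout described above, the following writes a small binary .pheno file and checks it. The sample names are hypothetical, and the column order (sample name, sample name, phenotype) is an assumption read from the description rather than something confirmed by the program.

```shell
# Hypothetical samples; columns are tab separated:
# sample name, sample name again, phenotype (0 or 1 for a binary trait).
printf 'sample_1\tsample_1\t0\n' >  metadata.pheno
printf 'sample_2\tsample_2\t1\n' >> metadata.pheno
printf 'sample_3\tsample_3\t1\n' >> metadata.pheno

# Count lines that do not have exactly three tab-separated fields.
bad_lines=$(awk -F '\t' 'NF != 3 { n++ } END { print n + 0 }' metadata.pheno)
echo "malformed lines: $bad_lines"

# Report how the phenotype column would be classified under the rule above:
# binary only if every value is 0 or 1.
pheno_type=$(awk -F '\t' '$3 != 0 && $3 != 1 { q = 1 } END { print (q ? "quantitative" : "binary") }' metadata.pheno)
echo "phenotype type: $pheno_type"
```

A sample with an unknown phenotype is left out of the file entirely rather than written with a placeholder value.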
- -k [ --kmers ] arg
  - dsm kmer output file
- -p [ --pheno ] arg
  - .pheno metadata
- --struct arg
  - mds values from kmds
- --covar_file arg
  - file containing covariates
- --covar_list arg
  - list of covariate columns to use. Format is 1,2q,3 (use q for quantitative)
- --threads arg (=1)
  - number of threads. Suggested: 4
- --no_filtering
  - turn off all filtering and perform tests on all kmers input
- --max_length arg (=100)
  - maximum kmer length
- --maf arg (=0.01)
  - minimum kmer frequency
- --min_words arg
  - minimum kmer occurrences. Overrides --maf
- --chisq arg (=10e-5)
  - p-value threshold for initial chi squared test. Set to 1 to show all
- --pval arg (=10e-8)
  - p-value threshold for final logistic test. Set to 1 to show all
- --print_samples
  - print lists of samples significant kmers were found in
- --version
  - prints version and exits
- -h [ --help ]
  - full help message
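The --covar_list format (1,2q,3) can be read as a comma-separated list of column numbers in which a trailing q marks a column as quantitative. The sketch below only illustrates that reading of the help text; the describe_covars function is hypothetical and is not how seer itself parses the option.

```shell
# Hypothetical helper: print one line per entry of a --covar_list value.
describe_covars() {
  # Split on commas; a trailing q marks the column as quantitative.
  echo "$1" | tr ',' '\n' | while read -r entry; do
    case "$entry" in
      *q) echo "column ${entry%q}: quantitative" ;;
      *)  echo "column $entry: not marked quantitative" ;;
    esac
  done
}

describe_covars "1,2q,3"
```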
Basic usage:
- seer -k dsm_input.txt.gz --pheno metadata.pheno > significant_kmers.txt
To use the kmds output, increase execution speed, and give the most complete
output:
- seer -k filtered.gz --pheno metadata.pheno --struct filtered.dsm --threads 4 --print_samples
This manpage was written by Andreas Tille for the Debian distribution and can be
used for any other usage of the program.