NAME

bolt - Efficient large cohorts genome-wide Bayesian mixed-model association testing

SYNOPSIS

bolt [ options]

DESCRIPTION

The BOLT-LMM software package currently consists of two main algorithms, the BOLT-LMM algorithm for mixed model association testing, and the BOLT-REML algorithm for variance components analysis (i.e., partitioning of SNP-heritability and estimation of genetic correlations).
 
The BOLT-LMM algorithm computes statistics for testing association between phenotype and genotypes using a linear mixed model. By default, BOLT-LMM assumes a Bayesian mixture-of-normals prior for the random effect attributed to SNPs other than the one being tested. This model generalizes the standard infinitesimal mixed model used by previous mixed model association methods, providing an opportunity for increased power to detect associations while controlling false positives. Additionally, BOLT-LMM applies algorithmic advances to compute mixed model association statistics much faster than eigendecomposition-based methods, both when using the Bayesian mixture model and when specialized to standard mixed model association.
 
The BOLT-REML algorithm estimates heritability explained by genotyped SNPs and genetic correlations among multiple traits measured on the same set of individuals. BOLT-REML applies variance components analysis to perform these tasks, supporting both multi-component modeling to partition SNP-heritability and multi-trait modeling to estimate correlations. BOLT-REML applies a Monte Carlo algorithm that is much faster than eigendecomposition-based methods for variance components analysis at large sample sizes.

OPTIONS

-h [ --help ] print help message with typical options
--helpFull
print help message with full option list
--bfile arg
prefix of PLINK .fam, .bim, .bed files
--bfilegz arg
prefix of PLINK .fam.gz, .bim.gz, .bed.gz files
--fam arg
PLINK .fam file (note: file names ending in .gz are auto-[de]compressed)
--bim arg
PLINK .bim file(s); for >1, use multiple --bim and/or {i:j}, e.g., data.chr{1:22}.bim
--bed arg
PLINK .bed file(s); for >1, use multiple --bim and/or {i:j} expansion
--geneticMapFile arg
Oxford-format file for interpolating genetic distances: tables/genetic_map_hg##.txt.gz
--remove arg
file(s) listing individuals to ignore (no header; FID IID must be first two columns)
--exclude arg
file(s) listing SNPs to ignore (no header; SNP ID must be first column)
--maxMissingPerSnp arg (=0.1)
QC filter: max missing rate per SNP
--maxMissingPerIndiv arg (=0.1) QC filter: max missing rate per person
--phenoFile arg
phenotype file (header required; FID IID must be first two columns)
--phenoCol arg
phenotype column header
--phenoUseFam
use last (6th) column of .fam file as phenotype
--covarFile arg
covariate file (header required; FID IID must be first two columns)
--covarCol arg
categorical covariate column(s); for >1, use multiple --covarCol and/or {i:j} expansion
--qCovarCol arg
quantitative covariate column(s); for >1, use multiple --qCovarCol and/or {i:j} expansion
--covarUseMissingIndic
include samples with missing covariates in analysis via missing indicator method (default: ignore such samples)
--reml
run variance components analysis to precisely estimate heritability (but not compute assoc stats)
--lmm
compute assoc stats under the inf model and with Bayesian non-inf prior (VB approx), if power gain expected
--lmmInfOnly
compute mixed model assoc stats under the infinitesimal model
--lmmForceNonInf
compute non-inf assoc stats even if BOLT-LMM expects no power gain
--modelSnps arg
file(s) listing SNPs to use in model (i.e., GRM) (default: use all non-excluded SNPs)
--LDscoresFile arg
LD Scores for calibration of Bayesian assoc stats: tables/LDSCORE.1000G_EUR.tab.gz
--numThreads arg (=1)
number of computational threads
--statsFile arg
output file for assoc stats at PLINK genotypes
--dosageFile arg
file(s) containing imputed SNP dosages to test for association (see manual for format)
--dosageFidIidFile arg
file listing FIDs and IIDs of samples in dosageFile(s), one line per sample
--statsFileDosageSnps arg
output file for assoc stats at dosage format genotypes
--impute2FileList arg
list of [chr file] pairs containing IMPUTE2 SNP probabilities to test for association
--impute2FidIidFile arg
file listing FIDs and IIDs of samples in IMPUTE2 files, one line per sample
--impute2MinMAF arg (=0)
MAF threshold on IMPUTE2 genotypes; lower-MAF SNPs will be ignored
--bgenFile arg
file(s) containing Oxford BGEN-format genotypes to test for association
--sampleFile arg
file containing Oxford sample file corresponding to BGEN file(s)
--bgenSampleFileList arg
list of [bgen sample] file pairs containing BGEN imputed variants to test for association
--bgenMinMAF arg (=0)
MAF threshold on Oxford BGEN-format genotypes; lower-MAF SNPs will be ignored
--bgenMinINFO arg (=0)
INFO threshold on Oxford BGEN-format genotypes; lower-INFO SNPs will be ignored
--statsFileBgenSnps arg
output file for assoc stats at BGEN-format genotypes
--statsFileImpute2Snps arg
output file for assoc stats at IMPUTE2 format genotypes
--dosage2FileList arg
list of [map dosage] file pairs with 2-dosage SNP probabilities (Ricopili/plink2 --dosage format=2) to test for association
--statsFileDosage2Snps arg
output file for assoc stats at 2-dosage format genotypes

SEE ALSO

https://data.broadinstitute.org/alkesgroup/BOLT-LMM/ Copyright © 2014-2018 Harvard University. Distributed under the GNU GPLv3+ open source license.