Coder Social home page Coder Social logo

qtlbim's Introduction

qtlbim

The goal of qtlbim is to perform multivariate bayesian QTL mapping after the methods developed in several journal articles by Yi, et al. Currently, it works only with two-parent crosses, but we’d like to develop methods for multiparent crosses like the Diversity Outbred mice.

This particular repository was forked from the CRAN Github repository and subsequently updated to work with new versions of gcc.

Installation

qtlbim isn’t on CRAN right now.

You can install the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("fboehm/qtlbim")

Detailed use instructions are available in the vignettes. We reproduce below part of the text from one vignette.

Example

This is a basic example:

library(qtlbim)
#> Loading required package: qtl
#> Loading required package: lattice
#> Loading required package: coda
#> Loading required package: tools
#> Loading required package: MASS
## basic example code

This document focuses on the hyper dataset from the qtl (Broman et al. 2003) R package, which was initially studied in Sugiyama et al. (2001). The hyper dataset is stored in qtl as a cross object. The R/qtlbim package processes this cross object to create a qb object called qbHyper, containing the MCMC samples. The hyper demo shows how this is done.

It is possible to directly load the already saved qb object with the data command. Following this by a call to qb.cross extracts a version of the cross object used to create the qb object.

data(qbHyper)
hyper <- qb.cross(qbHyper)

Alternatively, a object can be created by the following sequence of commands. First load the hyper data set from R/qtl, and subset on the autosomes, as R/qtlbim does not yet handle the X chromosome properly.

data(hyper)
hyper <- subset(hyper, chr=1:19)

To run the MCMC sampler on the hyper data we use the command

hyper <- qb.genoprob(hyper, step=2) 
# Now run the MCMC model selection algorithm.
# This can take several minutes.
qbHyper <- qb.mcmc(hyper, pheno.col = 1, seed = 1616)
#> qb.mcmc is running 60,600 iterations. The current iterations are saved: 
#> 200
#> 400
#> 600
#> 800
#> 1000
#> 1200
#> 1400
#> 1600
#> 1800
#> 2000
#> 2200
#> 2400
#> 2600
#> 2800
#> 3000
#> MCMC sample has been saved to: ./bp_Feb-14-190121.
#> Bayesian MCMC took 0.74 minutes.
#> Warning: Removing external directory ./bp_Feb-14-190121

The option seed sets the pseudorandom number seed so that this run can be repeated exactly. The qb object called qbHyper is used throughout this vignette.

Creating Bayesian interval mapping MCMC samples

This section describes in more detail how to create Markov chain Monte Carlo (MCMC) samples from the Bayesian posterior to be used for QTL mapping. The next step to mapping with the R/qtl package would be to use the function calc.genoprob to create genotype probabilities based on a Hidden Markov model. However, for Bayesian model selection, we replace calc.genoprob with the R/qtlbim function qb.genoprob. The function qb.genoprob performs some bookkeeping before calling calc.genoprob with the variable stepwidth option for pseudomarker positions. The probabilities for genotypes at pseudomarkers and at markers with missing data are calculated by calc.genoprob from the observed marker data using the multipoint method (Jiang and Zeng, 1997).

The MCMC samples are created by qb.mcmc after running qb.genoprob. In the simplest case, MCMC samples are created with the following two calls:

hyper <- qb.genoprob(hyper, step=2) 
qbHyper <- qb.mcmc(hyper, pheno.col = 1)
#> qb.mcmc is running 60,600 iterations. The current iterations are saved: 
#> 200
#> 400
#> 600
#> 800
#> 1000
#> 1200
#> 1400
#> 1600
#> 1800
#> 2000
#> 2200
#> 2400
#> 2600
#> 2800
#> 3000
#> MCMC sample has been saved to: ./bp_Feb-14-190206.
#> Bayesian MCMC took 0.75 minutes.
#> Warning: Removing external directory ./bp_Feb-14-190206

By default the qb.mcmc function prints out progress messages of the number of iterations completed. These progress messages can be suppressed by setting verbose=FALSE. Arguments for the routines qb.data and qb.model, described below, can be passed through qb.mcmc. Otherwise, default values are used. The detail below for qb.data.qb.model, andqb.mcmc` routines could be skipped in favor of default settings.

The function qb.data specifies the traits to be analyzed, their underlying distribution, the random and/or fixed covariates and whether to standardize or to use a boxcox transformation. Note that, the cross object can have several phenotypes and some of which could be used as covariates.

qbData <- qb.data(hyper, pheno.col = 1, trait = "normal", fixcov = 0, rancov = 0)

The R/qtlbim routines handle normal, binary and ordinal data. In addition, the user can specify fixed (fixcov) and random (rancov) covariate(s). [The pheno.col, fixcov, and rancov values can be numeric indices to the phenotype names, or character strings with exact phenotype names.] Fixed covariates can be included as interacting covariates with the intcov option to qb.model (see below).

The function qb.model defines the model parameters, using defaults that work well in most settings. Users are probably most interested in specifying if epistasis is considered, the prior expected number of main effect QTLs (main.nqtl), and the prior expected total number of QTLs (mean.nqtl), which includes additional QTLs with only epistatic effects. A user may set main.nqtl and mean.nqtl based on previous QTL analysis, for example using R/qtl. Setting the maximum number of QTLs overall (max.nqtl) or per chromosome (chr.nqtl), and setting the minimum interval between linked QTL, can be used to restrict sampling as needed.

Typically a real data set has several traits which can be considered as covariates. The intcov option specifies which covariate(s) can interact with QTLs, or equivalently, which environmental factors may have GxE interactions. The intcov should be a vector of 0s and 1s of the same length as the fixcov option specified for qb.data (see above).

qbModel <- qb.model(hyper, epistasis = TRUE, main.nqtl = 3, 
interval = rep(5,nchr(hyper)), chr.nqtl = rep(2,nchr(hyper)), 
depen = FALSE, prop = c(0.5, 0.1, 0.05))

The function qb.mcmc creates MCMC samples on the data and model specified. The results are initially saved in a unique directory under mydir, which is removed at completion of the command. Options for qb.data and qb.model can be passed directly to qb.mcmc, or as the objects created above.

qb <- qb.mcmc(hyper, data = qbData, model = qbModel, mydir = ".", 
              n.iter = 3000, n.thin = 20, genoupdate = TRUE)
#> qb.mcmc is running 60,600 iterations. The current iterations are saved: 
#> 200
#> 400
#> 600
#> 800
#> 1000
#> 1200
#> 1400
#> 1600
#> 1800
#> 2000
#> 2200
#> 2400
#> 2600
#> 2800
#> 3000
#> MCMC sample has been saved to: ./bp_Feb-14-190251.
#> Bayesian MCMC took 0.75 minutes.
#> Warning: Removing external directory ./bp_Feb-14-190251

The genoupdate option simulates pseudomarker and missing marker genotypes if TRUE, or uses a Haley-Knott (1992) type approach if FALSE; the latter is faster, but not generally recommended if there are many missing genotypes or selective genotyping. n.iter samples are saved, thinning to one in n.thin from the MCMC samples to reduce serial correlation. That is, n.iter * n.thin samples are drawn, after an initial n.burnin samples (1% of total by default) are discarded to allow the chain to converge closer to the posterior distribution.

qtlbim's People

Contributors

byandell avatar eddelbuettel avatar fboehm avatar

Stargazers

 avatar  avatar  avatar

Forkers

eddelbuettel

qtlbim's Issues

Add README.md

I'll steal example code from the Rnw files to make a README.md file

multiple trait support in qtlbim

I've read your 2012 Methods Mol Biol. paper and am trying to follow your example. I've noticed that the multiple.trait argument is not used in, e.g. qb.sim.cross. It is handled by some of the code and so I've been able to make some adjustments in order to use this argument. Is this functionality available in any archived versions, or is it being eliminated for some reason? Thanks for the great library and any hints about using it for multiple traits.

Compiler warnings

Using my default settings, builds are verbose:

edd@rob:~/git/qtlbim(master)$ install.r                                                                  
* installing *source* package found in current working directory ...
* installing *source* package ‘qtlbim’ ...                                                                                                                                                                         files ‘demo/00Index’, ‘demo/qb.coda.tour.R’, ‘demo/qb.covar.tour.R’, ‘demo/qb.hyper.tour.R’, ‘demo/qb.mcmc.tour.R’, ‘demo/qb.plot.tour.R’, ‘demo/qb.scan.tour.R’, ‘demo/qb.sim.tour.R’, ‘demo/qb.tour.R’, ‘demo/qtl
bim.R’, ‘inst/doc/prototype.qtl.hyper.slides.Rnw’, ‘inst/doc/prototype.qtl.hyper.slides.pdf’, ‘inst/doc/qtlbim.Rnw’, ‘inst/doc/qtlbim.pdf’, ‘inst/external/hyper.paper.extra.Rnw’, ‘inst/external/hyper.slide.extra
.Rnw’, ‘inst/external/prototype.qtl.hyper.paper.Rnw’ are missing
file ‘DESCRIPTION’ has the wrong MD5 checksum
** using staged installation                                                                             
** libs
ccache gcc -I"/usr/share/R/include" -DNDEBUG     -fcommon -fpic  -g -O3 -Wall -pipe -pedantic  -std=gnu99 -c GlobalVars.c -o GlobalVars.o
ccache gcc -I"/usr/share/R/include" -DNDEBUG     -fcommon -fpic  -g -O3 -Wall -pipe -pedantic  -std=gnu99 -c GlobalVars_SingleTrait.c -o GlobalVars_SingleTrait.o
ccache gcc -I"/usr/share/R/include" -DNDEBUG     -fcommon -fpic  -g -O3 -Wall -pipe -pedantic  -std=gnu99 -c MatrixUtils.c -o MatrixUtils.o
ccache gcc -I"/usr/share/R/include" -DNDEBUG     -fcommon -fpic  -g -O3 -Wall -pipe -pedantic  -std=gnu99 -c MultipleTraitsMCMC.c -o MultipleTraitsMCMC.o
MultipleTraitsMCMC.c: In function ‘MT_QTLPOSITION’:                                                      
MultipleTraitsMCMC.c:835:7: warning: this ‘if’ clause does not guard... [-Wmisleading-indentation]
  835 |       if(UPDATEGENO==0) MT_Coefficient0(I,L,trait,QLOC[trait][L]);   
      |       ^~
MultipleTraitsMCMC.c:837:4: note: ...this statement, but the latter is misleadingly indented as if it were guarded by the ‘if’
  837 |    for(K=0;K<NC;K++) COEF[trait][I][L][K]=X[K];
      |    ^~~
MultipleTraitsMCMC.c: In function ‘QTLPOSITION_SameLocation’:
MultipleTraitsMCMC.c:924:7: warning: this ‘if’ clause does not guard... [-Wmisleading-indentation]
  924 |       if(UPDATEGENO==0) MT_Coefficient0(I,L,0,QLOC[0][L]);
      |       ^~
MultipleTraitsMCMC.c:925:4: note: ...this statement, but the latter is misleadingly indented as if it were guarded by the ‘if’
  925 |    for(K=0;K<NC;K++) COEF[0][I][L][K]=X[K];
      |    ^~~
MultipleTraitsMCMC.c: In function ‘multipleTraitsMCMC’:
MultipleTraitsMCMC.c:2938:9: warning: this ‘if’ clause does not guard... [-Wmisleading-indentation]
 2938 |         if(DiffLocation==0)
      |         ^~
MultipleTraitsMCMC.c:2950:11: note: ...this statement, but the latter is misleadingly indented as if it were guarded by the ‘if’
 2950 |           if(DiffLocation==1)
      |           ^~
MultipleTraitsMCMC.c:2440:24: warning: unused variable ‘S’ [-Wunused-variable]
 2440 | int NU=6; double H=0.1,S=2,TAU; //NU=degrees of freedom, H=heritability, TAU=scale
      |                        ^
MultipleTraitsMCMC.c:3437:12: warning: suggest parentheses around comparison in operand of ‘&’ [-Wparentheses]
 3437 | if(MULTIPLE>=1 & PH1 == 0) // only print sigma once per iteration.
      |    ~~~~~~~~^~~
ccache gcc -I"/usr/share/R/include" -DNDEBUG     -fcommon -fpic  -g -O3 -Wall -pipe -pedantic  -std=gnu99 -c RMultipleTraitsMCMCSetup.c -o RMultipleTraitsMCMCSetup.o
ccache gcc -I"/usr/share/R/include" -DNDEBUG     -fcommon -fpic  -g -O3 -Wall -pipe -pedantic  -std=gnu99 -c ROutputManager.c -o ROutputManager.o
ccache gcc -I"/usr/share/R/include" -DNDEBUG     -fcommon -fpic  -g -O3 -Wall -pipe -pedantic  -std=gnu99 -c RSingleTraitMCMCSetup.c -o RSingleTraitMCMCSetup.o
ccache gcc -I"/usr/share/R/include" -DNDEBUG     -fcommon -fpic  -g -O3 -Wall -pipe -pedantic  -std=gnu99 -c SingleTraitMCMC.c -o SingleTraitMCMC.o
ccache gcc -I"/usr/share/R/include" -DNDEBUG     -fcommon -fpic  -g -O3 -Wall -pipe -pedantic  -std=gnu99 -c SingleTraitMCMCSamplingRoutines.c -o SingleTraitMCMCSamplingRoutines.o
ccache gcc -I"/usr/share/R/include" -DNDEBUG     -fcommon -fpic  -g -O3 -Wall -pipe -pedantic  -std=gnu99 -c StatUtils.c -o StatUtils.o
StatUtils.c: In function ‘RANDOM’:
StatUtils.c:16:2: warning: this ‘if’ clause does not guard... [-Wmisleading-indentation]
   16 |  if(U>=1.0) U=1.0-(1e-10); if(U<=0.0) U=1e-10;
      |  ^~
StatUtils.c:16:28: note: ...this statement, but the latter is misleadingly indented as if it were guarded by the ‘if’
   16 |  if(U>=1.0) U=1.0-(1e-10); if(U<=0.0) U=1e-10;
      |                            ^~
ccache gcc -Wl,-S -shared -L/usr/lib/R/lib -Wl,-Bsymbolic-functions -Wl,-z,relro -o qtlbim.so GlobalVars.o GlobalVars_SingleTrait.o MatrixUtils.o MultipleTraitsMCMC.o RMultipleTraitsMCMCSetup.o ROutputManager.o RSingleTraitMCMCSetup.o SingleTraitMCMC.o SingleTraitMCMCSamplingRoutines.o StatUtils.o -L/usr/lib/R/lib -lR
installing to /usr/local/lib/R/site-library/00LOCK-qtlbim/00new/qtlbim/libs
** R
** data
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded from temporary location
** checking absolute paths in shared objects and dynamic libraries
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (qtlbim)
edd@rob:~/git/qtlbim(master)$ 

PR should be forthcoming in a few.

Consider converting Rnw files to Rmd

This may or may not be worth the effort. I can probably write a bash script to find and replace the code chunk delimiters to switch between the two formats (Rnw and Rmd).

Consider 'cutting ties' with the repo you forked from

Your repo is currently listed as

image

which means it is not a 'genuine' repo (as it is stated as a fork). As described in this page of the GitHub documentation, contributions to such repositories "do not count". It is not the most important thing in the world but if you plan to make more changes here you may want to follow the route described on that page and ask GitHub to 'cut' you from the CRAN/qtlbim repo you started from. I think I have done that twice myself in the last decade, and it only takes an email ... and within a few days they usually respond. And you have what seems like a valid case of rejuvenating a dormant -- yet useful! -- repo.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.