uscbiostats / aphylo

Statistical inference of genetic functions in phylogenetic trees

Home Page: https://uscbiostats.github.io/aphylo

License: Other

C++ 16.54% R 53.08% M4 0.37% Shell 0.01% Makefile 0.40% TeX 29.52% Dockerfile 0.07%
annotations inference phylogenetics r r-package rcpparmadillo

aphylo's Introduction

USCbiostats

Name               Description
AnnoQR             R client wrap for AnnoQ API (https://github.com/blueOwl/AnnoQR)
aphylo             Statistical inference of genetic functions in phylogenetic trees
bayesnetworks      C++ program to fit Bayesian networks, illustrated with simulated data
BinaryDosage       Converts VCF files to a binary format
causnet            What the Package Does (One Line, Title Case)
fdrci              Permutation-Based FDR Point and Confidence Interval Estimation
fmcmc              A friendly MCMC framework
GxEScanR           R version of GxEScan
HiLDA              An R package for inferring the mutational exposures difference between groups
hJAM               hJAM is a hierarchical model which unifies the framework of Mendelian Randomization and Transcriptome-wide association studies.
iMutSig            A web application to identify the most similar mutational signature using shiny
jsPhyloSVG         htmlwidgets for the jsPhyloSVG JavaScript library
LUCIDus            Latent and Unknown Cluster Analysis using Integrated Data
MergeBinaryDosage  R package for merging binary dosage files
partition          A fast and flexible framework for agglomerative partitioning in R
pfamscanr          An R client for EMBL-EBI’s PfamScan API
polygons           Flexible functions for computing polygons coordinates in R
rphyloxml          Read and write phyloXML files in R
selectKSigs        Selection of K in finding the number of mutational signatures
slurmR             slurmR: A Lightweight Wrapper for Slurm
xrnet              R Package for Hierarchical Regularized Regression to Incorporate External Data
xtune              An R package for Lasso and Ridge Regression with differential penalization based on prior knowledge

aphylo's People

Contributors

gvegayon, immaterial0

aphylo's Issues

Before starting simulations

  • Write tests for aphylo_formula.
  • Write more tests for aphylo_mcmc using formulas (compare against raw MCMC).
  • Rewrite predict.aphylo_estimates so that it uses the log-likelihood function created from the model call.
  • Include a warning when fixing parameters in the MLE estimation.

Plot loglikelihood

For the plot.aphylo_estimates method, use the modified version of the likelihood, i.e., the one that includes the priors. Right now it plots the solution without considering the priors.
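
A minimal sketch of why the plotted surface should include the priors. Everything here is a toy stand-in: `loglik`, `logprior`, and `logpost` are hypothetical names, the Bernoulli model and the Beta(2, 20) prior are assumptions for illustration, and none of it is aphylo's actual likelihood.

```r
# Hypothetical stand-ins: a toy Bernoulli log-likelihood and a Beta prior.
loglik <- function(p, y) sum(dbinom(y, size = 1, prob = p, log = TRUE))
logprior <- function(p, a = 2, b = 20) dbeta(p, a, b, log = TRUE)

# "Modified" likelihood: log-likelihood plus log-prior.
logpost <- function(p, y) loglik(p, y) + logprior(p)

y <- c(1, 0, 0, 1, 0, 0, 0, 0, 0, 0)
grid <- seq(0.01, 0.99, by = 0.01)
ll <- vapply(grid, loglik, numeric(1), y = y)
lp <- vapply(grid, logpost, numeric(1), y = y)

# The two surfaces peak at different values, so a plot of the plain
# likelihood would not be centered on an estimate obtained with priors.
c(without_prior = grid[which.max(ll)], with_prior = grid[which.max(lp)])
```

Calling `plot(grid, lp, type = "l")` would draw the surface that matches the estimate obtained with priors.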

Extending the MCMC function

While we intend to make the MCMC function its own R package, and hence give it its own repo, we will start working on the improvements right here. The idea of this GitHub issue is to list all suggestions (wishes) for the MCMC function. For now, the development roadmap has the following:

  1. Implement Tempering MCMC.

  2. Implement adaptive MCMC (global and local).

  3. Parallel MCMC (multiple chains).
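
As a rough illustration of the third item, the sketch below runs several independent chains of a plain random-walk Metropolis sampler. It is not the package's MCMC function: `rw_metropolis` and `target` are hypothetical names, and the standard normal target is an assumption chosen only so the result is easy to check.

```r
# Minimal random-walk Metropolis sampler (a sketch, not aphylo's MCMC).
# `logpost` is any function returning the log of the target density.
rw_metropolis <- function(logpost, init, nsteps, scale = 1) {
  draws <- matrix(NA_real_, nrow = nsteps, ncol = length(init))
  current <- init
  current_lp <- logpost(current)
  for (i in seq_len(nsteps)) {
    proposal <- current + rnorm(length(init), sd = scale)
    proposal_lp <- logpost(proposal)
    # Accept with probability min(1, target(proposal) / target(current)).
    if (log(runif(1)) < proposal_lp - current_lp) {
      current <- proposal
      current_lp <- proposal_lp
    }
    draws[i, ] <- current
  }
  draws
}

# Toy target: standard normal log-density, up to a constant.
target <- function(x) -0.5 * sum(x^2)

# Multiple chains from dispersed starting points. lapply() runs them
# sequentially; parallel::parLapply() or parallel::mclapply() could
# replace it to run the chains concurrently.
set.seed(123)
inits <- c(-5, 0, 5)
chains <- lapply(inits, function(s) rw_metropolis(target, s, nsteps = 5000))

# Post burn-in means, one per chain; they should all be close to 0.
sapply(chains, function(ch) mean(ch[-(1:1000), 1]))
```

Starting the chains far apart makes it easier to see whether they converge to the same region, which is the usual motivation for running more than one.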

In addition, we would like to include some benchmarks against the following packages: rjags, rstan, mcmc, etc., on some standard problems.

Numerical instability and computational complexity of the LogLike function

After a few tests of the phylo_mcmc function I realized something that Duncan mentioned some time ago: the complexity of the algorithm is bad. In rough, and perhaps sloppy, terms, as a function of the number of functions p, the algorithm performs p*2^p computations. So for p = 1, 2, 4 we get

1x2^1 = 2
2x2^2 = 8
4x2^4 = 64
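
The same count can be tabulated with a short R sketch. It takes the rough estimate above at face value (2^p joint binary states, each costing about p operations); `cost` is a hypothetical helper, not part of the package.

```r
# Rough operation count from the estimate above: p functions give 2^p
# joint binary states, each costing about p operations (an assumption).
cost <- function(p) p * 2^p

p <- c(1, 2, 4)
counts <- data.frame(p = p, states = 2^p, operations = cost(p))
print(counts)

# Enumerate the 2^p states explicitly for p = 3 to see the blow-up:
# one row per combination of 0/1 values across the three functions.
states <- as.matrix(expand.grid(rep(list(0:1), 3)))
nrow(states)
```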

That said, I need to look more carefully at the LogLike function itself and at the probabilities function in C++ to see whether we can improve efficiency. This is especially important for the MCMC case.

Here are some tests with the current state of the function:

> library(rbenchmark)
> benchmark(
+     `1 fun`=with(dat1, LogLike(experiment, offspring, noffspring, c(.2,.2), c(.2,.2), .5)),
+     `2 fun`=with(dat2, LogLike(experiment, offspring, noffspring, c(.2,.2), c(.2,.2), .5)),
+     `3 fun`=with(dat3, LogLike(experiment, offspring, noffspring, c(.2,.2), c(.2,.2), .5)),
+     replications=100, relative="elapsed")[,1:4]
   test replications elapsed relative
1 1 fun          100   0.005      1.0
2 2 fun          100   0.011      2.2
3 3 fun          100   0.027      5.4

The instability comes from the fact that, as p increases (and n as well), the root-node probabilities tend to zero, which is another issue I ran into along the way. That makes LogLike undefined. I need to look into that too.
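
A common remedy for this kind of underflow is to keep the computation in log space. The sketch below is a generic illustration with made-up probabilities, not a patch for LogLike; `log_sum_exp` is a hypothetical helper.

```r
# Toy illustration of underflow: multiplying many probabilities
# eventually gives exactly 0 in double precision, so log() returns -Inf.
set.seed(1)
probs <- runif(2000, min = 0.1, max = 0.5)

naive <- log(prod(probs))   # prod() underflows to 0
stable <- sum(log(probs))   # same quantity, computed in log space

# When probabilities must be added (e.g. summing over states at a node),
# the log-sum-exp trick keeps the computation in log space.
log_sum_exp <- function(lx) {
  m <- max(lx)
  if (is.infinite(m)) return(m)  # every term is log(0), or one is Inf
  m + log(sum(exp(lx - m)))
}

c(naive = naive, stable = stable)

# Finite here, whereas log(exp(-1000) + exp(-1001)) evaluates to -Inf.
log_sum_exp(c(-1000, -1001))
```

Subtracting the maximum before exponentiating guarantees that at least one term equals exp(0) = 1, so the sum inside the logarithm cannot underflow to zero.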

Visualization tools

Web-based

  • Suggested by Paul Thomas: we can use iTOL. It seems to have an API.

  • jsPhyloSVG is a JavaScript, SVG-based visualization tool. It seems to have an API and, moreover, to be suitable for use with Shiny. Elsevier uses it. So far, this is the tool that I like the most.

  • SILVA Tree Viewer is a webapp.

  • TreeLink is a webapp.

  • PhyD3 makes interactive visualizations. Suitable for Shiny.

In R

  • The ape::plot.phylo function has several options that I was not aware of. Now that I know the node coordinates are stored in .PlotPhyloEnv$last_plot.phylo as xx and yy, I can use them as input to create more customized visualizations.

  • The ggtree R package, currently used here, is not working properly when handling the annotations.
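
The coordinates stored by ape::plot.phylo can be retrieved roughly as follows. This is a sketch that assumes the ape package is installed; the random tree from `rtree` and the `coords` data frame are only for illustration.

```r
library(ape)

set.seed(1)
tree <- rtree(5)          # random 5-tip tree, for illustration only

pdf(NULL)                 # draw to a null device so this runs headless
plot(tree)                # dispatches to ape's plot.phylo

# plot.phylo() stores the settings of the last plot in .PlotPhyloEnv.
last <- get("last_plot.phylo", envir = .PlotPhyloEnv)
coords <- data.frame(node = seq_along(last$xx), x = last$xx, y = last$yy)
print(coords)             # tips come first, internal nodes follow

# The coordinates can drive custom annotations on the open plot.
points(coords$x, coords$y, pch = 19, col = "tomato")
dev.off()
```

In an interactive session the `pdf(NULL)` and `dev.off()` calls can be dropped so that the points are drawn on the visible device.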

Use `ape::as.phylo` methods

The phylo class of the ape package seems to be a very popular storage format for phylogenetic trees, so it is worthwhile either storing this package's objects as phylo objects or creating conversion functions to take advantage of the ape package.

Using reorder(x, "postorder") already gives us a peeling sequence!

I just figured out that reorder(., "postorder"), through ape's reorder method for phylo objects, can be mapped to the po-tree that we are creating with aphylo. This means we don't need a special class of object to store the trees: it suffices to call reorder and pass the second column of x$edge as the peeling sequence. This is both encouraging and a shame, since (1) it makes aphylo even more compatible with the ape package, and (2) it means that all the development on PO trees that I've done is useless 😭 (but that's OK!).
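
A small sketch of the idea, assuming the ape package is installed. The random tree is only for illustration, and `peeling` and `ok` are hypothetical names; the check confirms that in the reordered edge matrix every node is listed as a child only after all of its own offspring.

```r
library(ape)

set.seed(1)
tree <- rtree(6)                      # random tree, for illustration only
po <- reorder(tree, "postorder")      # dispatches to ape's reorder.phylo

# In postorder, the edge pointing into a node comes after all edges
# leaving that node, so the child column lists offspring before parents.
peeling <- po$edge[, 2]
print(po$edge)

# Check the property: once a node shows up as a child, it must never
# appear as a parent further down the edge matrix.
ok <- vapply(seq_along(peeling), function(i) {
  !(peeling[i] %in% po$edge[seq_len(nrow(po$edge)) > i, 1])
}, logical(1))
all(ok)
```

Walking `peeling` from first to last therefore visits each node after its descendants, which is the order a pruning-style likelihood computation needs.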
