pmhn's Issues

Implement tree simulator

Implement the generative process for TreeMHN trees. It should follow Algorithm 1 (pseudocode) from the Supplementary Information.
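As a starting point for discussion, here is a minimal sketch of what such a simulator could look like. It is not a transcription of Algorithm 1. It assumes that each mutation `j` not yet in a lineage waits an exponential time with rate `exp(theta[j, j] + sum of theta[j, i] over i in the lineage)`, and that only events occurring before an exponentially distributed sampling time are kept. The `Node` class, the `simulate_tree` name, the `sampling_rate` parameter and the indexing convention of `theta` are all hypothetical.

```python
from dataclasses import dataclass, field

import numpy as np


@dataclass
class Node:
    """Hypothetical tree node: a mutation index and the time at which it occurred."""

    mutation: int  # -1 marks the root (no mutation)
    time: float
    children: list["Node"] = field(default_factory=list)


def simulate_tree(theta, sampling_rate, rng):
    """Sketch of a TreeMHN-style generative process (assumed conventions).

    theta[j, i] is assumed to be the log-effect of mutation i on mutation j,
    and theta[j, j] the log base rate of mutation j.
    """
    n = theta.shape[0]
    # Events after the sampling time are not observed.
    t_sample = rng.exponential(1.0 / sampling_rate)
    root = Node(mutation=-1, time=0.0)
    stack = [(root, [])]  # pairs of (node, mutations on the path from the root)
    while stack:
        node, lineage = stack.pop()
        for j in range(n):
            if j in lineage:
                continue  # no repeated mutations within a lineage
            log_rate = theta[j, j] + sum(theta[j, i] for i in lineage)
            t = node.time + rng.exponential(1.0 / np.exp(log_rate))
            if t < t_sample:
                child = Node(mutation=j, time=t)
                node.children.append(child)
                stack.append((child, lineage + [j]))
    return root, t_sample
```

Because every lineage can contain each mutation at most once, the loop always terminates, and the resulting trees never contain repeated mutations in a lineage or identical siblings.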

Backend for loglikelihood calculation

Overview

A backend for calculating the loglikelihood and its derivatives.
It can be based, e.g., on the LearnMHN package with joblib parallelisation.

Tasks:

  • Loglikelihood and gradient for a single genotype and MHN.
  • Loglikelihood and gradient for several genotypes and shared MHN.
  • Loglikelihood and gradient for several genotypes and several MHNs.

Description

We want to have functions with the following signatures:

def loglikelihood(genotype: Bool[Array, " M"], theta: Float[Array, "M M"]) -> float:
    ...

def gradient(genotype: Bool[Array, " M"], theta: Float[Array, "M M"]) -> Float[Array, "M M"]:
    ...

implementing the loglikelihood and the gradient for a particular genotype.

Apart from that we want to have vectorized versions as described above.
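For unit tests on small examples, a brute-force reference can be handy. The sketch below is not the LearnMHN implementation: it enumerates all `2**M` genotypes, so it is only feasible for a handful of genes. It assumes the usual MHN setup in which the observation time is exponentially distributed with rate 1, so that the genotype distribution is `(I - Q)^{-1} p0`, and it assumes that `theta[i, j]` is the log-effect of mutation `j` on mutation `i`. The helper name `build_generator` is hypothetical.

```python
import numpy as np


def build_generator(theta):
    """Dense transition rate matrix Q over all 2**M genotypes.

    Genotypes are encoded as integers (bit j set means mutation j is present).
    Q[y, x] is the rate of the jump x -> y, so columns sum to zero.
    """
    m = theta.shape[0]
    size = 2**m
    q = np.zeros((size, size))
    for x in range(size):
        present = [j for j in range(m) if (x >> j) & 1]
        for i in range(m):
            if (x >> i) & 1:
                continue  # mutation i is already present
            rate = np.exp(theta[i, i] + sum(theta[i, j] for j in present))
            y = x | (1 << i)
            q[y, x] += rate
            q[x, x] -= rate
    return q


def loglikelihood(genotype, theta):
    """Brute-force log P(genotype | theta), assuming observation time ~ Exp(1)."""
    m = theta.shape[0]
    q = build_generator(theta)
    p0 = np.zeros(2**m)
    p0[0] = 1.0  # start with no mutations
    p = np.linalg.solve(np.eye(2**m) - q, p0)
    index = sum(int(b) << j for j, b in enumerate(genotype))
    return float(np.log(p[index]))
```

For a single gene with `theta = [[0.0]]` the mutation rate is 1, so both genotypes have probability 1/2 and the loglikelihood is `log(0.5)`. The vectorized versions can be checked against a plain loop over this function.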

PyMC Op for vanilla MHN

Create a PyMC Op that calculates the MHN loglikelihood and its gradient.

Note: this issue depends on #3.

TreeMHN: implement gradient of the loglikelihood

We want to add gradient calculation to the backend:

def gradient(tree: Tree, theta: np.ndarray) -> np.ndarray:
    ...

Note that this is not a priority at the start of the project: we will try to do the modelling as soon as possible, starting with a sequential Monte Carlo sampler and Metropolis transitions on small simulated data.

Only after initial experiments (we will probably see that it's not scalable), we'll consider switching to Hamiltonian Monte Carlo (which requires gradients).

Simulation framework

We want to be able to simulate a discrete Markov chain from the Markov process given by an MHN.

Utilities:

  • Sampling a trajectory from initial state at time $t_0=0$ to $t_\mathrm{max}$.
  • Sampling time $t_\mathrm{max}$ from exponential distribution.
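The two utilities above could be sketched with a Gillespie-style simulation. This is only an illustration under assumptions: the function names, the `rate` parameter, and the convention that `theta[i, j]` is the log-effect of mutation `j` on mutation `i` (with `theta[i, i]` the log base rate) are not fixed by this issue.

```python
import numpy as np


def sample_t_max(rng, rate=1.0):
    """Draw the end time from an exponential distribution with the given rate."""
    return rng.exponential(1.0 / rate)


def sample_trajectory(theta, t_max, rng):
    """Simulate jumps from the all-zero genotype at time 0 until t_max.

    Returns a list of (time, state) pairs, one per visited state.
    """
    m = theta.shape[0]
    state = np.zeros(m, dtype=bool)
    t = 0.0
    trajectory = [(t, state.copy())]
    while not state.all():
        # Log-rate of acquiring mutation i given the mutations already present.
        log_rates = np.diag(theta) + theta @ state.astype(float)
        # Mutations that are already present cannot occur again.
        rates = np.where(state, 0.0, np.exp(log_rates))
        total = rates.sum()
        t += rng.exponential(1.0 / total)  # waiting time until the next jump
        if t > t_max:
            break
        i = rng.choice(m, p=rates / total)  # which mutation occurs
        state[i] = True
        trajectory.append((t, state.copy()))
    return trajectory
```

The last entry of the returned list is the genotype observed at `t_max`, which is what a likelihood test against simulated data would compare to.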

PyMC Op for personalised MHNs

Add a PyMC Op which takes the full genotype matrix as well as an array of shape (patients, genes, genes) representing the MHNs and evaluates:

  • total loglikelihood
  • the derivative of the total loglikelihood with respect to the MHNs (which will again be of shape (patients, genes, genes))

Implement TreeMHN likelihood

We want to have a backend implementing a function of essentially the following signature:

def loglikelihood(tree: Tree, theta: np.ndarray) -> float:
    """Calculates the loglikelihood.

    Args:
        tree: tree
        theta: unconstrained (log-) theta matrix of shape `(n_genes, n_genes)`

    Returns:
      loglikelihood, `log P(tree | theta)`
    """
    raise NotImplementedError

The implementation should be accompanied by unit tests for cases where the answer is known (either calculated manually or obtained from the original implementation).

Note that this task may be already too large to be accomplished in one sprint. After we discuss it, we can split it into several smaller subtasks.

Implement tree validation

As Xiang has noticed:

We may also want to add functions to check if the trees contain the following cases
A -> B -> B
B <- A -> B
because TreeMHN does not allow repeated mutations in the same lineage or identical siblings.
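A possible shape for such a check is sketched below. The `Node` class here is a hypothetical stand-in, since the actual `Tree` type of the project is not specified in this issue, and `is_valid_tree` is an assumed name.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical tree node labelled with a mutation name."""

    mutation: str
    children: list["Node"] = field(default_factory=list)


def is_valid_tree(root):
    """Return False if a mutation repeats in a lineage or two siblings are identical."""

    def visit(node, ancestors):
        labels = [child.mutation for child in node.children]
        if len(labels) != len(set(labels)):
            return False  # identical siblings, e.g. B <- A -> B
        for child in node.children:
            if child.mutation in ancestors:
                return False  # repeated mutation in a lineage, e.g. A -> B -> B
            if not visit(child, ancestors | {child.mutation}):
                return False
        return True

    return visit(root, {root.mutation})
```

For example, `is_valid_tree(Node("A", [Node("B", [Node("B")])]))` and `is_valid_tree(Node("A", [Node("B"), Node("B")]))` both return `False`, while `Node("A", [Node("B"), Node("C")])` passes.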
