scMET

Bayesian modelling of DNA methylation heterogeneity at single-cell resolution

Background

Here we introduce scMET, a Bayesian framework for the analysis of single-cell DNA methylation data. This modelling approach combines a hierarchical beta-binomial specification with a generalised linear model framework, with the aim of capturing biological overdispersion and overcoming data sparsity by sharing information across cells and genomic features.

To disentangle technical from biological variability and overcome data sparsity, scMET couples a hierarchical beta-binomial (BB) model with a generalised linear model (GLM) framework (Fig. 1a-b). For each cell i and region j, the input for scMET is the number of CpG sites that are observed to be methylated (Y) and the total number of sites for which methylation status was recorded (n). The BB model uses feature-specific mean parameters mu to quantify overall DNA methylation (DNAm) across all cells, and biological overdispersion parameters gamma as a proxy for cell-to-cell DNAm heterogeneity. The overdispersion parameters capture the amount of variability that is not explained by binomial sampling noise, which would only account for technical variation.
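As a rough illustration of the paragraph above, the BB model can be written in the standard mean-overdispersion parameterisation of the beta-binomial distribution. This is a sketch and not taken from this README: the subscripts and the symbols $\mu_j$ and $\gamma_j$ (for mu and gamma) are notation assumed here.

```latex
% Assumed notation: Y_{ij} methylated CpGs and n_{ij} covered CpGs
% for cell i and feature j; \mu_j mean, \gamma_j overdispersion.
\begin{align}
  Y_{ij} \mid n_{ij}, \mu_j, \gamma_j &\sim \mathrm{BB}(n_{ij}, \mu_j, \gamma_j), \\
  \mathbb{E}[Y_{ij}] &= n_{ij}\,\mu_j, \\
  \mathrm{Var}[Y_{ij}] &= n_{ij}\,\mu_j\,(1 - \mu_j)\,\bigl[1 + (n_{ij} - 1)\,\gamma_j\bigr].
\end{align}
```

Under this parameterisation, the variance reduces to the binomial variance as $\gamma_j \to 0$, so any $\gamma_j > 0$ measures variability beyond binomial sampling noise.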

The GLM framework is incorporated at two levels. First, it introduces feature-specific covariates x (e.g. CpG density) that may explain differences in mean methylation mu across features. Second, similar to Eling et al. (2018), we use a non-linear regression framework to capture the mean-overdispersion trend that is typically observed in high-throughput sequencing data, such as scBS-seq (Fig. 1c). This trend is used to derive residual overdispersion parameters epsilon: a measure of cell-to-cell variability that is not confounded by mean methylation. Feature-specific parameters are subsequently used for (i) feature selection, to identify highly variable features (HVFs) that drive cell-to-cell epigenetic heterogeneity (Fig. 1d), and (ii) differential methylation testing, to highlight features that show differences in DNAm mean or variability between specified groups of cells (Fig. 1e).
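The idea of a residual overdispersion can be sketched as follows. This is only an illustration under stated assumptions: the link function $g$, the trend function $f$ and the regression weights $w_{\mu}$ are hypothetical placeholders, and the exact forms used by scMET are given in the paper rather than in this README.

```latex
% Hypothetical sketch: g is an unspecified link function,
% f an unspecified non-linear trend in the mean, w_\mu regression weights.
\begin{align}
  \mathbb{E}\bigl[\operatorname{logit}(\mu_j)\bigr] &= w_{\mu}^{\top} x_j, \\
  \epsilon_j &= g(\gamma_j) - f(\mu_j).
\end{align}
```

Read this way, a positive $\epsilon_j$ marks a feature that is more variable than expected for features with a similar mean methylation, and a negative $\epsilon_j$ marks one that is less variable.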

[Figure 1: overview of the scMET model]

Installation

# Install stable version from Bioconductor
if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("scMET")

## Or development version from GitHub
# install.packages("remotes")
remotes::install_github("andreaskapou/scMET")

Installation issue requiring the V8 library (outdated)

This is an old issue with RStan, kept here for reference.

The scMET package depends heavily on RStan, whose newer versions depend on the V8 library (see this issue: stan-dev/rstan#831). For users who don't have the system-level V8 dependency pre-installed, there are two approaches.

  1. (Recommended) Download a static libv8 library when installing on Linux:
Sys.setenv(DOWNLOAD_STATIC_LIBV8 = 1)
install.packages("V8")

See this blog post for more details: https://ropensci.org/blog/2020/11/12/installing-v8/.

  2. If the above approach does not work, install the rstan package from this branch, which removes the V8 dependency and the code that calls it and makes no other changes:
remotes::install_github("makoshark/rstan", ref = "develop", subdir = "rstan/rstan")

Then proceed with installing scMET as above. If you still get the error about installing V8 during the installation, try setting dependencies = FALSE when calling the install_github function.

Online vignette

An online vignette can be found at https://rpubs.com/cakapourani/scmet-analysis.

Citation

Kapourani, C. A., Argelaguet, R., Sanguinetti, G., & Vallejos, C. A. (2021). scMET: Bayesian modeling of DNA methylation heterogeneity at single-cell resolution. Genome Biology, 22(1), 1-21.

https://genomebiology.biomedcentral.com/articles/10.1186/s13059-021-02329-8

Contributors

alanocallaghan, andreaskapou, andrjohns


scmet's Issues

data conversion process

Do you have a video or a manual providing the data conversion process for gastrulation?

How can I apply scMET on real data? And does it do imputation?

Hello dear author, it's exciting to find this new tool!
I've been following Melissa but gave up due to the lack of detailed annotation of genome features. Could you please briefly explain the difference between scMET and Melissa, and how to determine which tool is more suitable for my data?
Also, when I check "Online vignette", I found "scMET on real data: TODO", can I perform the same analysis according to the synthetic data tutorial?
Thanks!

Apply scMET on sliding windows

Hello Dear Andreas,

I was trying to apply scMET on a large-scale scBS-seq dataset using non-overlapping sliding windows of 20 kb. I noticed that you have suggested in the scMET paper:

In the spirit of divide-and-conquer schemes, we bypass this problem via a parallelization strategy in which we apply scMET separately to each chromosome. Feature-specific estimates obtained for each chromosome can be combined post hoc when performing HVF selection and differential analyses.

Do you have any instructions on how to combine the estimates post hoc? Or any functions developed for that purpose? Thanks in advance! Looking forward to hearing back from you.

Best,
Ning

How to use multiple CPU threads?

Hi there,
thanks for making this tool! I'm currently trying it out and noticed that scMET uses only a single core even when I set n_cores to a higher number. Is this a bug or do I have to register the threads first (if so, how)?
Cheers, Lukas
