rimanb / cellannotationtutorial

This project forked from baderlab/cellannotationtutorial

Accompanying code for the tutorial: Annotating single cell transcriptomic maps using automated and manual methods

License: MIT License

Annotating single cell transcriptomic maps using automated and manual methods

Single-cell transcriptomics can profile thousands of cells in a single experiment and identify novel cell types, states and dynamics in a wide range of tissues and organisms. Standard experimental protocols and analysis workflows have been developed to create single-cell transcriptomic maps from tissues. This tutorial focuses on how to interpret these data to identify cell types, states and other biologically relevant patterns with the objective of creating an annotated map of cells.

In the written tutorial, we recommend a three-step workflow: automatic cell annotation, manual cell annotation, and verification. We discuss frequently encountered challenges and strategies to address them, and we cover guiding principles and specific recommendations for the software tools and resources that can be used at each step.

Accompanying code

To make the tutorial's recommendations more accessible, we have provided an R Notebook that guides the user through specific tools. Realistically, every single-cell map annotation case will be different and will likely not require all of these tools. For the purposes of this tutorial, the tools use publicly available data and cover reference-based and marker-based automatic annotation, manual annotation, and how to build a consensus set of cluster annotations. The R Notebook file can be downloaded and run in your own RStudio installation. This lets you work through the steps interactively and at your own pace; a full run of the file also creates a human-readable HTML file on your system.

Installation instructions

This code has been successfully run using R 4.0.3 and the following packages:

SingleCellExperiment_1.12.0*
Seurat_3.2.2
scater_1.18.3*
SCINA_1.2.0
devtools_2.3.2
dplyr_1.0.4*
scmap_1.12.0*
celldex_1.0.0*
SingleR_1.4.0*
ggplot2_3.3.3
harmony_1.0**
cerebroApp_1.3.0**
msigdb_0.2.0**

"*" = packages must be installed with BiocManager::install("package") instead of install.packages("package").
"**" = packages must be installed from GitHub using devtools (e.g. devtools::install_github("gitRepo/package")).
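
The three installation routes above could be scripted roughly as follows. This is a minimal sketch, not part of the tutorial's own code. The helper names `missing_packages` and `install_tutorial_packages` are hypothetical, and the GitHub repository paths for harmony and cerebroApp are assumptions to verify before use. The README does not state a repository for msigdb, so it is left out. Note that these calls install the current versions rather than the pinned versions listed above.

```r
# Packages grouped by installation source, following the markers in the list above.
cran_pkgs <- c("Seurat", "SCINA", "devtools", "ggplot2")
bioc_pkgs <- c("SingleCellExperiment", "scater", "dplyr",
               "scmap", "celldex", "SingleR")

# ASSUMPTION: these GitHub repository paths are not given in the README.
# msigdb is omitted because its repository is not stated; add it once known.
github_repos <- c(harmony    = "immunogenomics/harmony",
                  cerebroApp = "romanhaa/cerebroApp")

# Return the subset of `pkgs` that cannot be loaded on this system.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Install whatever is missing from each source. Nothing happens until
# this function is called explicitly.
install_tutorial_packages <- function() {
  to_cran <- missing_packages(cran_pkgs)
  if (length(to_cran) > 0) install.packages(to_cran)

  to_bioc <- missing_packages(bioc_pkgs)
  if (length(to_bioc) > 0) {
    if (!requireNamespace("BiocManager", quietly = TRUE)) {
      install.packages("BiocManager")
    }
    BiocManager::install(to_bioc)
  }

  to_github <- missing_packages(names(github_repos))
  if (length(to_github) > 0) {
    devtools::install_github(unname(github_repos[to_github]))
  }
}
```

Calling `install_tutorial_packages()` once in a fresh R session should then leave only the packages without a known source to install by hand.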

If you haven't yet installed R on your system, you can download R from https://cran.r-project.org/ and RStudio from https://rstudio.com/products/rstudio/download/.

This tutorial takes advantage of open-source data. The "query dataset" that we are annotating is available from 10X Genomics at https://cf.10xgenomics.com/samples/cell-exp/1.1.0/pbmc3k/pbmc3k_filtered_gene_bc_matrices.tar.gz. Additionally, for marker-based annotation we use a list of marker genes from Diaz-Mejia JJ et al. These datasets are downloaded automatically when the R code is run.
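
The notebook handles the download itself. For readers who want to fetch the query dataset separately, a step like the following might be used. This is a sketch under assumptions, not the notebook's code: the function name `fetch_query_dataset` and the `data` directory are hypothetical, and the unpacked folder layout (`filtered_gene_bc_matrices/hg19`) is assumed from the usual structure of this 10X archive.

```r
# URL of the query dataset, as given in the text above.
pbmc_url <- "https://cf.10xgenomics.com/samples/cell-exp/1.1.0/pbmc3k/pbmc3k_filtered_gene_bc_matrices.tar.gz"

# Download and unpack the archive, returning the directory that holds
# the matrix files. The download is skipped if the archive already exists.
fetch_query_dataset <- function(url = pbmc_url, dest_dir = "data") {
  dir.create(dest_dir, showWarnings = FALSE, recursive = TRUE)
  archive <- file.path(dest_dir, basename(url))
  if (!file.exists(archive)) {
    download.file(url, destfile = archive, mode = "wb")
  }
  untar(archive, exdir = dest_dir)
  # ASSUMPTION: the archive unpacks to filtered_gene_bc_matrices/hg19.
  file.path(dest_dir, "filtered_gene_bc_matrices", "hg19")
}

# Usage (requires network access and Seurat):
# matrix_dir <- fetch_query_dataset()
# counts <- Seurat::Read10X(data.dir = matrix_dir)
# pbmc <- Seurat::CreateSeuratObject(counts = counts, project = "pbmc3k")
```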

Content

The code consists of the following sections:

  1. Reference-based automatic annotation: This section annotates the query dataset using a previously labeled reference dataset. Many tools exist to do this; we cover scmap (cell and cluster) and SingleR. We also explore how to use integration as a form of annotation, using Harmony.

  2. Refining / Consensus annotations: After obtaining multiple cell type labels for each cell from reference-based automatic annotation, we keep the labels that occur most commonly across methods.

  3. Marker-based automatic annotation: Instead of using a reference dataset to annotate the query dataset, we input lists of marker genes associated with specific cell types. The program we have chosen to demonstrate here is SCINA.

  4. Manual annotation: Here, we extract marker genes and associated pathways from the query dataset. To determine cell-type labels from this information, we compare our differentially expressed genes and pathways to those described in the literature. To facilitate this process, we use Seurat and cerebroApp.
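
The consensus step in section 2 can be illustrated with a simple majority vote. The sketch below uses base R only and toy labels invented for illustration; it is not the notebook's implementation. The function name `consensus_label` and the `"ambiguous"` tie label are assumptions. In this sketch "unassigned" counts as an ordinary vote, which may not be what you want in practice.

```r
# Hypothetical per-cell labels from three methods (toy data, not real output).
labels <- data.frame(
  scmap_cluster = c("B cell", "T cell",  "Monocyte", "unassigned"),
  scmap_cell    = c("B cell", "T cell",  "T cell",   "NK cell"),
  singler       = c("B cell", "NK cell", "Monocyte", "T cell"),
  stringsAsFactors = FALSE
)
methods <- c("scmap_cluster", "scmap_cell", "singler")

# Return the most frequent label in `x`, or `tie_label` when two or more
# labels share the highest count.
consensus_label <- function(x, tie_label = "ambiguous") {
  counts <- table(x)
  top <- names(counts)[counts == max(counts)]
  if (length(top) == 1) top else tie_label
}

# Apply the vote row by row, one row per cell.
labels$consensus <- apply(labels[, methods], 1, consensus_label)
print(labels$consensus)
# "B cell" "T cell" "Monocyte" "ambiguous"
```

Cells flagged as ambiguous are natural candidates for the marker-based and manual annotation steps that follow.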

Timing

The associated R packages take about 5 minutes to install (R itself takes only a couple of minutes), and the code takes about 10 minutes to run.
