This repository contains code for analyzing single-cell RNA sequencing data from adult and larval zebrafish retinal ganglion cells (RGCs), associated with the paper Kölsch, Yvonne, et al. "Molecular classification of zebrafish retinal ganglion cells links genes to cell types to behavior.", Neuron, in press. The analysis relies heavily on the R package Seurat and may be useful to users who wish to reproduce the results of the paper or try alternative analysis strategies. The raw sequence data associated with the publication are publicly available through the Gene Expression Omnibus (accession number GSE152842), while the gene expression matrices (GEMs) are available here as rds files.
Direct visualization of the results reported in the paper is available through the Broad Institute's Single Cell Portal (SCP). The final R data objects are also available via Google Drive for those who wish to perform their own analysis and visualization.
If you use data or code made available here in your work, please consider citing:
Kölsch, Yvonne, et al. "Molecular classification of zebrafish retinal ganglion cells links genes to cell types to behavior." bioRxiv (2020).
Please direct any questions associated with this repository to Joshua Hahn ([email protected]) or Karthik Shekhar ([email protected]).
This repository contains five R notebooks in total. One notebook is a tutorial for those interested in immediately working with and visualizing the data objects. The remaining four notebooks go through separate portions of the analysis, beginning from the raw count matrices. Each notebook is accompanied by an html document showing results. In addition to these notebooks, a variety of custom scripts are available in the utils folder to simplify the analysis.
This notebook provides a brief tutorial on how to interact with the final zebrafish objects available here. The tutorial covers the basic object architecture and a few visualization tools in Seurat.
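As a rough illustration of the kind of interaction the tutorial covers, the sketch below loads an rds file and inspects it. It assumes the file holds a Seurat object with at least one stored dimensionality reduction; the file path is a hypothetical placeholder, not a file name from this repository.

```r
library(Seurat)

# Hypothetical path: replace with the .rds file you downloaded.
rds_path <- "path/to/zebrafish_rgc_object.rds"

if (file.exists(rds_path)) {
  # Assumption: the file stores a Seurat object.
  rgc <- readRDS(rds_path)

  print(rgc)                    # assays, number of features and cells
  print(head(rgc@meta.data))    # per-cell metadata
  print(table(Idents(rgc)))     # number of cells per active identity (cluster)

  # Plot the first stored reduction, if the object has one.
  reductions <- Reductions(rgc)
  if (length(reductions) > 0) {
    print(DimPlot(rgc, reduction = reductions[1], label = TRUE))
  }
}
```

Gene-level views such as `FeaturePlot()` or `VlnPlot()` can be applied in the same way once the gene names stored in the object are known.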
This notebook guides users through the clustering of adult zebrafish RGCs using functionalities within the R package Seurat. Steps include loading the count matrices, setting up the Seurat object, initial clustering, data integration, removal of contaminant cell classes, and cluster visualization.
This notebook guides users through the clustering of larval zebrafish RGCs using functionalities within the R package Seurat. Steps include loading the count matrices, setting up the Seurat object, initial clustering, removal of contaminant cell classes, separation of mature and immature clusters, and cluster visualization.
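The object setup and initial clustering steps can be sketched with a generic Seurat workflow. This is a minimal sketch on a synthetic count matrix, not the notebook's actual code: the matrix, the project name, and all parameter values (filters, number of principal components, resolution) are illustrative assumptions, and data integration and contaminant removal are not shown.

```r
library(Seurat)
library(Matrix)

set.seed(1)

# Synthetic stand-in for a count matrix: 200 genes x 100 cells.
counts <- matrix(
  rpois(200 * 100, lambda = 2), nrow = 200,
  dimnames = list(paste0("gene", 1:200), paste0("cell", 1:100))
)
counts <- Matrix(counts, sparse = TRUE)

# Set up the Seurat object (filter thresholds are illustrative).
obj <- CreateSeuratObject(counts = counts, project = "rgc_sketch",
                          min.cells = 3, min.features = 50)

# Generic preprocessing: normalize, select variable genes, scale, run PCA.
obj <- NormalizeData(obj, verbose = FALSE)
obj <- FindVariableFeatures(obj, nfeatures = 100, verbose = FALSE)
obj <- ScaleData(obj, verbose = FALSE)
obj <- RunPCA(obj, npcs = 10, verbose = FALSE)

# Graph-based clustering on the leading principal components.
obj <- FindNeighbors(obj, dims = 1:10, verbose = FALSE)
obj <- FindClusters(obj, resolution = 0.5, verbose = FALSE)

# Visualize clusters in PCA space.
print(DimPlot(obj, reduction = "pca"))
```

Because the synthetic counts are pure noise, the resulting clusters carry no biological meaning; the sketch only shows the order of the calls.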
This notebook explores expression of transcription factors, neuropeptides, and cell surface and adhesion molecules across larval and adult clusters by starting from initial gene databases curated from zfin.org.
This notebook builds supervised classification models using xgboost to compare the larval and adult clusters. One classification model is built to map adult cells to mature larval clusters. Adult clusters and mature larval clusters that map one-to-one are further explored to discover type-specific and global changes in expression patterns. A second model is built to map mature larval cells to immature larval clusters to determine the extent to which diversification is complete at the larval stage.
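The general idea of mapping cells from one dataset onto the clusters of another can be sketched as a multi-class xgboost classifier: train on cells with known cluster labels, then predict labels for the query cells and tabulate the result. The sketch below uses synthetic data and illustrative hyperparameters; all variable names are hypothetical, and it does not reproduce the functions in the utils folder.

```r
library(xgboost)

set.seed(1)
n_class <- 3
n_genes <- 20

# Simulate cells x genes matrices in which each class has one elevated gene.
simulate_cells <- function(n_per_class) {
  labels <- rep(0:(n_class - 1), each = n_per_class)  # xgboost expects 0-based labels
  x <- matrix(rnorm(length(labels) * n_genes), ncol = n_genes)
  for (k in 0:(n_class - 1)) {
    x[labels == k, k + 1] <- x[labels == k, k + 1] + 4
  }
  list(x = x, labels = labels)
}

train <- simulate_cells(50)  # stands in for the reference cells and their clusters
query <- simulate_cells(20)  # stands in for the cells to be mapped

dtrain <- xgb.DMatrix(data = train$x, label = train$labels)
params <- list(objective = "multi:softprob", num_class = n_class,
               eta = 0.2, max_depth = 4)
model <- xgb.train(params = params, data = dtrain, nrounds = 30)

# Depending on the xgboost version, predict() returns a flat vector
# (one block of class probabilities per cell) or a cells x classes matrix.
raw <- predict(model, xgb.DMatrix(data = query$x))
prob <- if (is.matrix(raw)) raw else matrix(raw, ncol = n_class, byrow = TRUE)

# Assign each query cell to its most probable class and tabulate the mapping.
assigned <- max.col(prob) - 1
confusion <- table(true = query$labels, assigned = assigned)
print(confusion)
```

A confusion table of this kind is one way to see which clusters correspond one-to-one; in a real analysis the rows would be the query cells' own cluster labels rather than simulated ground truth.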
This folder contains three scripts used for analysis and a fourth script demonstrating how to implement the xgboost algorithm.
Contains a variety of functions used for plotting and figure generation.
Contains functions to condense portions of the analysis.
Contains functions for implementing the xgboost algorithm.
An example script showing how to implement the xgboost algorithm.