MAT2

Manifold Alignment of Single-Cell Transcriptomes with Cell Triplets

Overview

MAT2 aligns multiple single-cell transcriptome datasets. It works in two steps:

  1. Manifold alignment based on contrastive learning: For a cell of interest C, MAT2 selects a cell Cp of the same cell type from a different dataset and a cell Cn of a different cell type, forming a cell triplet (C, Cp, Cn). Contrastive learning then makes the distance between C and Cp in the latent manifold space much smaller than the distance between C and Cn, which aligns the single-cell transcriptomes across datasets.
  2. Reconstruction of gene expression profiles: Neural network decoders reconstruct the consensus gene expression and the batch-specific deviation. The consensus gene expression can be used for downstream analyses such as differential expression analysis and lineage tracing.
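
The triplet idea in step 1 can be illustrated with a small, self-contained sketch. It is not MAT2's implementation: the function names, the Euclidean distance, and the margin-based hinge loss are assumptions chosen because they are a common way to express "C should be closer to Cp than to Cn".

```python
import math
import random


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def pick_triplet(anchor_idx, batches, types, rng):
    """Return (C, Cp, Cn) indices for the cell at anchor_idx, or None.

    Cp: same cell type, different dataset (batch).
    Cn: different cell type (any batch).
    """
    n = len(types)
    positives = [i for i in range(n)
                 if types[i] == types[anchor_idx]
                 and batches[i] != batches[anchor_idx]]
    negatives = [i for i in range(n) if types[i] != types[anchor_idx]]
    if not positives or not negatives:
        return None  # no valid triplet exists for this cell
    return anchor_idx, rng.choice(positives), rng.choice(negatives)


def triplet_loss(z_c, z_p, z_n, margin=1.0):
    """Hinge-style triplet loss (an assumed form, not taken from MAT2).

    The loss is zero once d(C, Cn) exceeds d(C, Cp) by at least the margin.
    """
    return max(0.0, euclidean(z_c, z_p) - euclidean(z_c, z_n) + margin)


# Toy latent embeddings for four cells drawn from two batches.
latent = [[0.0, 0.0], [0.1, 0.0], [3.0, 4.0], [3.1, 4.0]]
batches = ["A", "B", "A", "B"]
types = ["T cell", "T cell", "B cell", "B cell"]

c, cp, cn = pick_triplet(0, batches, types, random.Random(0))
loss = triplet_loss(latent[c], latent[cp], latent[cn])
```

In this toy example the two T cells already sit close together and far from the B cells, so the loss is zero. A poorly aligned embedding would give a positive loss for the encoder to minimize.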

Installation

First, clone the MAT2 repository with git.

git clone https://github.com/Zhang-Jinglong/MAT2.git
cd MAT2/

Install the Python packages that MAT2 depends on with conda, then run setup.py from the command line to install MAT2 itself.

conda install --file requirements.txt --yes
python setup.py install

Usage

The example Jupyter notebook demo/test.ipynb in the MAT2 source code demonstrates how to align single-cell transcriptome datasets with MAT2.

The following is a brief description of how to use MAT2 in Python:

Loading datasets

The test data can be found in the demo/ folder in the MAT2 repository.

import pandas as pd
from MAT2 import BuildMAT2

# MAT2 takes pandas DataFrames as input.
# Multiple batches of data are concatenated into one matrix of size
# gene_num x cell_num (rows are genes, columns are cells).
data = pd.read_csv('data.csv', header=0, index_col=0)

# The row names of metadata should correspond to the cell names in data.
# Metadata must contain a 'batch' column, and also a 'type' column in supervised mode.
metadata = pd.read_csv('metadata.csv', header=0, index_col=0)

# Anchors are only needed when the model is not fully supervised.
# Each record contains two cell indices (each in [0, cell_num-1]) and a score (in [0.0, 1.0]).
anchor = pd.read_csv('anchor.csv', header=0, index_col=0)
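
The constraints described in the comments above can be checked before building a model. The helper below is a hypothetical sketch and not part of MAT2: `check_inputs` and its parameters are invented for illustration. It assumes that "correspond" means identical cell names in identical order, and that each anchor record is a (cell index, cell index, score) triple.

```python
def check_inputs(cell_names, meta_index, meta_columns,
                 anchors=None, supervised=True):
    """Return a list of problems; an empty list means the inputs look consistent.

    Hypothetical helper, not part of the MAT2 package.
    """
    problems = []
    # Assumption: metadata rows must list the same cells in the same order.
    if list(cell_names) != list(meta_index):
        problems.append("metadata row names do not match the cell names in data")
    if "batch" not in meta_columns:
        problems.append("metadata lacks the 'batch' column")
    if supervised and "type" not in meta_columns:
        problems.append("metadata lacks the 'type' column needed in supervised mode")
    n = len(cell_names)
    # Assumption: each anchor record is (cell index, cell index, score).
    for row, (i, j, score) in enumerate(anchors or []):
        if not (0 <= i < n and 0 <= j < n):
            problems.append(f"anchor {row}: cell index out of range")
        if not 0.0 <= score <= 1.0:
            problems.append(f"anchor {row}: score outside [0.0, 1.0]")
    return problems
```

With the DataFrames loaded above, a call could look like `check_inputs(data.columns, metadata.index, metadata.columns, anchor.itertuples(index=False), supervised=False)`, assuming the anchor table has exactly three columns after the index is removed.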

Building model & training

When cell type annotations are provided (supervised mode):

model = BuildMAT2(
    data=data,
    metadata=metadata,
    num_workers=2,
    use_gpu=True,
    mode='supervised',
    dropout_rate=0.3)
model.train(epochs=30)

When there are no cell type annotations but anchors are provided:

model = BuildMAT2(
    data=data,
    metadata=metadata,
    anchor=anchor,
    num_workers=2,
    use_gpu=True,
    mode='manual')
model.train(epochs=30)

When only some of the cells have type annotations, run in semi-supervised mode:

model = BuildMAT2(
    data=data,
    metadata=metadata,
    anchor=anchor,
    num_workers=2,
    use_gpu=True,
    mode='semi-supervised')
model.train(epochs=30)
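
The choice between the three modes above can be summarized in a small helper. This is a sketch under stated assumptions, not MAT2 code: `choose_mode` is a hypothetical function, unannotated cells are encoded as `None` (MAT2 may use a different encoding), and anchors are treated as required outside supervised mode only because the examples above pass `anchor` in both of those cases.

```python
def choose_mode(cell_types, has_anchor):
    """Suggest a value for the `mode` argument from the available annotations.

    cell_types: one entry per cell; None marks an unannotated cell
                (an assumed encoding, not necessarily MAT2's own).
    has_anchor: whether an anchor table is available.
    """
    annotated = sum(t is not None for t in cell_types)
    if annotated > 0 and annotated == len(cell_types):
        return "supervised"
    if not has_anchor:
        # Inferred from the examples, which pass anchors in both other modes.
        raise ValueError("anchors are needed when annotations are missing or partial")
    return "semi-supervised" if annotated > 0 else "manual"
```

For example, `choose_mode(["T cell", None, "B cell"], has_anchor=True)` returns `'semi-supervised'`, and the result can be passed as `mode=` when building the model.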

Testing

# Here the training data is reused as the test data.
test_data = data
# Calculate the reconstructed consensus gene expression.
rec = model.evaluate(test_data)
# Continue with your own downstream analysis on rec.

Contributors

cutelittledragon, zhang-jinglong