deep-dna

A repository of deep learning models for DNA samples and sequences.

Setup

git clone https://github.com/DLii-Research/deep-dna
cd deep-dna
pip3 install -e .

Pre-trained/Fine-tuned Model Artifacts

The following are some pre-trained/fine-tuned models available on Weights & Biases.

Pre-trained Models

Taxonomic Classification Models

Dataset Preparation

Start by specifying the data locations.

synthetic_data_path=~/Datasets/Synthetic

Generating Synthetic Test Sets

To generate a synthetic test set, use the ./scripts/dataset/generate_synthetic_test.py utility script. The following loop generates a test set for every combination of distribution, dataset, and synthetic classifier used in this project.

for distribution in natural presence-absence; do
    for dataset in Hopland Nachusa SFD Wetland; do
        for synthetic_classifier in Naive Bertax Topdown; do
            echo "Dataset: $dataset, Synthetic Classifier: $synthetic_classifier, Distribution: $distribution"
            # Quote the variables so paths containing spaces are passed as one argument.
            ./scripts/dataset/generate_synthetic_test.py \
                --synthetic-data-path "$synthetic_data_path" \
                --dataset "$dataset" \
                --synthetic-classifier "$synthetic_classifier" \
                --distribution "$distribution" \
                --sequence-length 150 \
                --num-subsamples 10
        done
    done
done
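As a quick sanity check before launching the loop above, the number of generator runs can be counted without invoking the script. This is a minimal sketch: the list variables and `num_runs` are illustrative names that do not appear in the repository.

```shell
# Count the combinations the nested loop iterates over.
# Nothing is generated here; the generator script is not called.
distributions="natural presence-absence"
datasets="Hopland Nachusa SFD Wetland"
classifiers="Naive Bertax Topdown"

num_runs=0
for distribution in $distributions; do
    for dataset in $datasets; do
        for synthetic_classifier in $classifiers; do
            num_runs=$((num_runs + 1))
        done
    done
done

# 2 distributions x 4 datasets x 3 classifiers
echo "Total runs: $num_runs"
```

With the lists shown, this reports 24 runs, each of which is asked for 10 subsamples via --num-subsamples.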

Taxonomy Evaluation

Below is a list of fine-tuned models available on Weights & Biases.

# DNABERT
export dnabert_taxonomy_naive=sirdavidludwig/dnabert-taxonomy/dnabert-taxonomy-naive-64d-150l:v0
export dnabert_taxonomy_bertax=sirdavidludwig/dnabert-taxonomy/dnabert-taxonomy-bertax-64d-150l:v0
export dnabert_taxonomy_topdown=sirdavidludwig/dnabert-taxonomy/dnabert-taxonomy-topdown-64d-150l:v0

# DNABERT (deeper)
export dnabert_taxonomy_topdown_deep=sirdavidludwig/dnabert-taxonomy/dnabert-taxonomy-topdown-deep-64d-150l:v0

# SetBERT
export setbert_taxonomy_topdown=sirdavidludwig/model-registry/setbert-taxonomy-topdown-64d-150l:v0

# SetBERT (leave-one-out controls)
export setbert_taxonomy_topdown_nhs=sirdavidludwig/setbert-taxonomy/setbert-taxonomy-topdown-nhs-64d-150l:v0
export setbert_taxonomy_topdown_nhw=sirdavidludwig/setbert-taxonomy/setbert-taxonomy-topdown-nhw-64d-150l:v0
export setbert_taxonomy_topdown_nsw=sirdavidludwig/setbert-taxonomy/setbert-taxonomy-topdown-nsw-64d-150l:v0
export setbert_taxonomy_topdown_hsw=sirdavidludwig/setbert-taxonomy/setbert-taxonomy-topdown-hsw-64d-150l:v0
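Each value above appears to follow the usual Weights & Biases artifact reference layout of entity/project/name:version. The sketch below splits one reference into those parts using plain shell parameter expansion, which can help when checking which project a model lives in. The variable names (`artifact`, `entity`, `project`, `name`, `version`) are illustrative and are not used by the repository's scripts.

```shell
# Split a W&B-style artifact reference into its components.
artifact=sirdavidludwig/dnabert-taxonomy/dnabert-taxonomy-naive-64d-150l:v0

entity=${artifact%%/*}        # text before the first "/"
rest=${artifact#*/}           # text after the first "/"
project=${rest%%/*}           # text before the next "/"
name_version=${rest#*/}       # artifact name plus ":version"
name=${name_version%%:*}      # text before the ":"
version=${name_version##*:}   # text after the ":"

echo "entity:  $entity"
echo "project: $project"
echo "name:    $name"
echo "version: $version"
```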

A particular model can be evaluated on a dataset using the evaluation scripts.

SetBERT:

python3 ./scripts/taxonomy/eval_setbert.py \
    --synthetic-data-path "$synthetic_data_path" \
    --dataset Nachusa \
    --synthetic-classifier Naive \
    --distribution natural \
    --output-path ./logs/taxonomy_classification/setbert_topdown \
    --model-artifact "$setbert_taxonomy_topdown" \
    --num-gpus 1
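To evaluate the same model on several datasets, the command above can be wrapped in a loop. The following is a dry-run sketch that only prints the commands it would run. It assumes the evaluation script accepts the same dataset names as the generator (only Nachusa is shown in this README), and the per-dataset output subdirectory is a hypothetical choice to keep results separate, not something the repository prescribes.

```shell
# Fall back to the values given earlier in this README if the variables are unset.
synthetic_data_path=${synthetic_data_path:-"$HOME/Datasets/Synthetic"}
setbert_taxonomy_topdown=${setbert_taxonomy_topdown:-sirdavidludwig/model-registry/setbert-taxonomy-topdown-64d-150l:v0}

num_commands=0
for dataset in Hopland Nachusa SFD Wetland; do
    # Dry run: print the command instead of executing it.
    # The "/$dataset" suffix on --output-path is an assumption for illustration.
    echo python3 ./scripts/taxonomy/eval_setbert.py \
        --synthetic-data-path "$synthetic_data_path" \
        --dataset "$dataset" \
        --synthetic-classifier Naive \
        --distribution natural \
        --output-path "./logs/taxonomy_classification/setbert_topdown/$dataset" \
        --model-artifact "$setbert_taxonomy_topdown" \
        --num-gpus 1
    num_commands=$((num_commands + 1))
done
```

Removing the leading `echo` turns the dry run into real invocations once the printed commands look right.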
