This project forked from calico/basenji

Sequential regulatory activity predictions with deep convolutional neural networks.

License: Apache License 2.0

Basenji

Sequential regulatory activity predictions with deep convolutional neural networks.

Basenji provides researchers with tools to:

  1. Train deep convolutional neural networks to predict regulatory activity along very long chromosome-scale DNA sequences.
  2. Score variants according to their predicted influence on regulatory activity across the sequence and/or for specific genes.
  3. Annotate the distal regulatory elements that influence gene activity.
  4. Annotate the specific nucleotides that drive regulatory element function.

Akita

3D genome folding predictions with deep convolutional neural networks.

Akita provides researchers with tools to:

  1. Train deep convolutional neural networks to predict 2D contact maps along very long chromosome-scale DNA sequences.
  2. Score variants according to their predicted influence on contact maps across the sequence and/or for specific genes.
  3. Annotate the specific nucleotides that drive genome folding.

Basset successor

This codebase offers numerous improvements and generalizations to its predecessor Basset, and I'll be using it for all of my ongoing work. Here are the salient changes.

  1. Basenji makes predictions in bins across the sequences you provide. You could replicate Basset's peak classification by simply providing smaller sequences and binning the target for the entire sequence.
  2. Basenji intends to predict quantitative signal using regression loss functions, rather than binary signal using classification loss functions.
  3. Basenji is built on TensorFlow, which offers myriad benefits, including distributed computing and a large and adaptive developer community.
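
To make the binning idea in point 1 concrete, here is a minimal sketch in plain Python. It is an illustration only, not code from this repository: the function name `bin_signal`, the choice of summing within each bin, and the toy coverage values are all assumptions made for the example.

```python
def bin_signal(signal, bin_size):
    """Sum a per-base signal into consecutive, non-overlapping bins.

    Hypothetical helper for illustration; not part of the Basenji API.
    """
    if bin_size <= 0 or len(signal) % bin_size != 0:
        raise ValueError("sequence length must be a positive multiple of bin_size")
    return [sum(signal[i:i + bin_size]) for i in range(0, len(signal), bin_size)]


# Toy coverage track over a 16 bp sequence (made-up numbers).
coverage = [0, 0, 1, 3, 5, 4, 2, 0, 0, 0, 0, 1, 0, 0, 0, 0]

# Several bins per sequence: one quantitative target per bin.
binned = bin_signal(coverage, bin_size=4)

# One bin spanning the whole sequence: a single target per sequence,
# which is the shape of target a per-sequence peak classifier would use.
whole = bin_signal(coverage, bin_size=len(coverage))

print(binned)  # [4, 11, 1, 0]
print(whole)   # [16]
```

With a short input sequence and a single bin, each sequence carries one target value, which is the setup described above for reproducing Basset-style peak prediction.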

However, this codebase is general enough to implement the Basset model, too. I have instructions for how to do that here.


Installation

Basenji/Akita were developed with Python 3 and a variety of scientific computing dependencies, which you can see and install via requirements.txt for pip and environment.yml for Anaconda. In each case, we kept TensorFlow separate to allow you to choose the install method that works best for you. The codebase is compatible with the latest TensorFlow 2, but should also work with 1.15.

Run the following to install dependencies and Basenji with Anaconda.

    conda env create -f environment.yml
    conda install tensorflow    # or: conda install tensorflow-gpu
    python setup.py develop --no-deps

Alternatively, if you want to guarantee working versions of each dependency, you can install via a fully pre-specified environment.

    conda env create -f prespecified.yml
    conda install tensorflow    # or: conda install tensorflow-gpu
    python setup.py develop --no-deps

Or run the following to install dependencies and Basenji with pip and setuptools.

    python setup.py develop
    pip install tensorflow    # or: pip install tensorflow-gpu

Then we recommend setting the following environment variables.

    export BASENJIDIR=~/code/Basenji
    export PATH=$BASENJIDIR/bin:$PATH
    export PYTHONPATH=$BASENJIDIR/bin:$PYTHONPATH

To verify the install, launch python and run

    import basenji
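
If you would rather get a short report than a bare import error, something like the following sketch may help. It is not part of this codebase: the function name `check_install` and the choice of what to check are assumptions. It uses only the standard library to test whether the `basenji` and `tensorflow` packages can be located and whether `BASENJIDIR` is set, without actually importing them.

```python
import importlib.util
import os


def check_install(modules=("basenji", "tensorflow"), env_vars=("BASENJIDIR",)):
    """Report which top-level modules are findable and which variables are set.

    Hypothetical helper for illustration; not shipped with Basenji.
    """
    report = {}
    for name in modules:
        # find_spec returns None when a top-level module cannot be located.
        report[name] = importlib.util.find_spec(name) is not None
    for var in env_vars:
        report["$" + var] = var in os.environ
    return report


if __name__ == "__main__":
    for item, ok in check_install().items():
        print(f"{item}: {'found' if ok else 'MISSING'}")
```

A MISSING entry for `basenji` usually means the `python setup.py develop` step was run in a different environment from the one currently active.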

Manuscripts

Models and (links to) data studied in various manuscripts are available in the manuscripts directory.


Documentation

At this stage, Basenji is something in between personal research code and accessible software for wide use. The primary challenge is uncertainty in what the best role for this type of toolkit is going to be in functional genomics and statistical genetics. The computational requirements don't make it easy either. Thus, this package is under active development, and I encourage anyone to get in touch to relate your experience and request clarifications or additional features, documentation, or tutorials.


Tutorials

These are a work in progress, so forgive incompleteness for the moment. If there's a task that you're interested in that I haven't included, feel free to post it as an Issue at the top.

Contributors

davek44, gfudenberg, mlbileschi, chihuahua, jakublipinski, jaspersnoek
