lehner-lab / mochi

Neural networks to fit interpretable models and quantify energies, energetic couplings, epistasis and allostery from deep mutational scanning data

License: MIT License

MoCHI

Welcome to the GitHub repository for MoCHI: Neural networks to fit interpretable models and quantify energies, energetic couplings, epistasis and allostery from deep mutational scanning data.

Table Of Contents

  1. Installation
  2. Usage
    1. Option A: MoCHI command line tool
    2. Option B: Custom Python script
    3. Demo
  3. Manual
  4. Bugs and feedback
  5. Citing MoCHI

Installation

The easiest way to install MoCHI is by using the bioconda package:

conda install -c bioconda pymochi

See the full Installation Instructions for further details and alternative installation options.

Usage

You can run a standard MoCHI workflow with the command line tool, or build a custom analysis by importing the "pymochi" package in your own Python script.

MoCHI requires a plain text model design file containing a table describing the measured phenotypes and how they relate to the underlying additive (biophysical) traits. The table should have the following 4 tab-separated columns (see example here):

  • trait: One or more additive trait names
  • transformation: The shape of the global epistatic trend (Linear/ReLU/SiLU/Sigmoid/SumOfSigmoids/TwoStateFractionFolded/ThreeStateFractionBound)
  • phenotype: A unique phenotype name e.g. Abundance, Binding or Kinase Activity
  • file: Path to DiMSum output (.RData) or plain text file with variant fitness and error estimates for the corresponding phenotype
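
As an illustration of the table layout described above, the following minimal sketch renders a two-row model design as tab-separated text using only the Python standard library. The file paths are placeholders, and joining multiple trait names with a comma is an assumption that should be checked against the linked example file.

```python
import csv
import io

# Column order follows the four columns listed above.
COLUMNS = ["trait", "transformation", "phenotype", "file"]

# Hypothetical rows: the file paths are placeholders, and the comma between
# trait names is an assumption to verify against the example design file.
rows = [
    {"trait": "Folding", "transformation": "TwoStateFractionFolded",
     "phenotype": "Abundance", "file": "fitness_abundance.txt"},
    {"trait": "Folding,Binding", "transformation": "ThreeStateFractionBound",
     "phenotype": "Binding", "file": "fitness_binding.txt"},
]

def render_model_design(rows):
    """Return the model design table as tab-separated text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=COLUMNS, delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

model_design_text = render_model_design(rows)
print(model_design_text)
```

Writing model_design_text to a plain text file gives something that could be passed as the model design file, provided the assumptions above hold.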

Option A: MoCHI command line tool

Replace MY_MODEL with the path to your model design file (see example here).

run_mochi.py --model_design MY_MODEL

Get help with additional command line parameters:

run_mochi.py -h
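
If you want to launch the command line tool from another Python program, a minimal sketch is to assemble the same argument list shown above. The helper name build_mochi_command and the path model_design.txt are hypothetical; only the --model_design option comes from this README.

```python
import shlex
from pathlib import Path

def build_mochi_command(model_design_path):
    """Return the argument list for the run_mochi.py call shown above."""
    return ["run_mochi.py", "--model_design", str(Path(model_design_path))]

# Placeholder path: replace with your own model design file.
command = build_mochi_command("model_design.txt")

# Print the command as it would be typed in a shell.
print(shlex.join(command))

# To actually run it (requires a working MoCHI installation):
# import subprocess
# subprocess.run(command, check=True)
```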

Option B: Custom Python script

Below is an example of a custom MoCHI workflow (written in Python) to infer the underlying free energies of folding and binding from doubledeepPCA data.

#Imports
import pymochi
from pymochi.data import MochiData
from pymochi.models import MochiTask
from pymochi.report import MochiReport
import pandas as pd
from pathlib import Path

#####################
# Step 1: Create a *MochiTask* object with one-hot encoded variant sequences, interaction terms and 10 cross-validation groups
#####################

#Globals
k_folds = 10
abundance_path = str(Path(pymochi.__file__).parent / "data/fitness_abundance.txt") #MoCHI demo data
binding_path = str(Path(pymochi.__file__).parent / "data/fitness_binding.txt") #MoCHI demo data

#Define model
my_model_design = pd.DataFrame({
   'phenotype': ['Abundance', 'Binding'],
   'transformation': ['TwoStateFractionFolded', 'ThreeStateFractionBound'],
   'trait': [['Folding'], ['Folding', 'Binding']],
   'file': [abundance_path, binding_path]})

#Create Task
mochi_task = MochiTask(
   directory = 'my_task',
   data = MochiData(
      model_design = my_model_design,
      k_folds = k_folds))

#####################
# Step 2: Hyperparameter tuning and model fitting
#####################

#Perform grid search over hyperparameters
mochi_task.grid_search()

#Fit model using optimal hyperparameters
for i in range(k_folds):
   mochi_task.fit_best(fold = i+1)

#####################
# Step 3: Generate report, phenotype predictions, inferred additive trait summaries and save task
#####################

temperature_celsius = 30
#RT: gas constant, 0.001987 kcal/(mol*K), multiplied by the temperature in Kelvin
RT = (273+temperature_celsius)*0.001987

mochi_report = MochiReport(
   task = mochi_task,
   RT = RT)

energies = mochi_task.get_additive_trait_weights(
   RT = RT)

mochi_task.save()

Report plots, predictions and additive trait summaries will be saved to the my_task/report, my_task/predictions and my_task/weights subfolders.
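
The RT argument in the script above is the gas constant multiplied by the absolute temperature. The sketch below works through that arithmetic for the 30 degrees Celsius used in the example. The function name thermal_energy is hypothetical, and the reading of the units (0.001987 being the gas constant in kcal/(mol*K), so RT is in kcal/mol) is an interpretation of the constant rather than something stated in this README.

```python
# Molar gas constant in kcal/(mol*K); this is the value used in the script above.
GAS_CONSTANT_KCAL = 0.001987

def thermal_energy(temperature_celsius, kelvin_offset=273):
    """Return RT for a temperature given in degrees Celsius.

    The script above adds 273 to convert to Kelvin; 273.15 is the more exact
    offset and can be passed as kelvin_offset if preferred.
    """
    return (kelvin_offset + temperature_celsius) * GAS_CONSTANT_KCAL

rt = thermal_energy(30)  # (273 + 30) * 0.001987
print(round(rt, 6))
```

At 30 degrees Celsius this gives an RT of roughly 0.602, which is the value both MochiReport and get_additive_trait_weights receive in the example.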

Demo MoCHI

Run the demo to ensure that you have a working MoCHI installation (expected run time <10min):

demo_mochi.py

Manual

Comprehensive documentation is coming soon. In the meantime, you can get more information about specific classes and methods from within Python, e.g.

help(MochiData)
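
For a quicker overview than the full help() output, the docstring of any object can be read with the standard inspect module. This is a generic Python sketch, not a MoCHI feature: first_doc_line is a hypothetical helper and Example is a stand-in class so that the snippet runs without pymochi installed.

```python
import inspect

def first_doc_line(obj):
    """Return the first line of an object's docstring, or an empty string."""
    doc = inspect.getdoc(obj) or ""
    return doc.splitlines()[0] if doc else ""

class Example:
    """Stand-in class used only for illustration."""

print(first_doc_line(Example))

# With pymochi installed, the same helper could be applied to its classes:
# from pymochi.data import MochiData
# print(first_doc_line(MochiData))
```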

Bugs and feedback

You can submit a bug report as an issue here on GitHub or send an email to [email protected].

Citing MoCHI

Please cite the following publication if you use MoCHI:

Faure, A. J. & Lehner, B. MoCHI: neural networks to fit interpretable models and quantify energies, energetic couplings, epistasis and allostery from deep mutational scanning data. bioRxiv (2024). doi:10.1101/2024.01.21.575681

Acknowledgements

Project based on the Computational Molecular Science Python Cookiecutter version 1.6.

(Vector illustration credit: Vecteezy!)
