This project forked from shiwentao00/pocket2drug

PyTorch implementation of Pocket2Drug: a generative deep learning model to predict binding drugs for ligand-binding sites.

License: MIT License

Pocket2Drug

Pocket2Drug is an encoder-decoder deep neural network that predicts binding drugs given protein binding sites (pockets). The pocket graphs are generated using Graphsite. The encoder is a graph neural network, and the decoder is a recurrent neural network. The SELFIES molecule representation is used as the tokenization scheme instead of SMILES.

[Figure: the pipeline of Pocket2Drug]

If you find Pocket2Drug helpful, please cite our paper in your work :)
Pocket2Drug: An encoder-decoder deep neural network for the target-based drug design
Wentao Shi, Manali Singha, Gopal Srivastava, Limeng Pu, J. Ramanujam, and Michal Brylinski
Frontiers in Pharmacology: 587
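
To make the SELFIES tokenization mentioned in the introduction concrete, here is a minimal sketch using only the Python standard library. A SELFIES string is a sequence of bracketed symbols, so it can be split into tokens with a regular expression. The helper names `tokenize_selfies` and `build_vocab` are hypothetical and are not taken from the Pocket2Drug code; the repository may well use the `selfies` package for this step instead.

```python
import re
from typing import Dict, Iterable, List

# One token is either a bracketed SELFIES symbol such as "[=C]" or the
# "." separator used between disconnected fragments.
SELFIES_TOKEN = re.compile(r"\[[^\[\]]+\]|\.")


def tokenize_selfies(selfies: str) -> List[str]:
    """Split a SELFIES string into its symbols (hypothetical helper)."""
    tokens = SELFIES_TOKEN.findall(selfies)
    # Reject input with characters that are not part of any token.
    if "".join(tokens) != selfies:
        raise ValueError(f"not a well-formed SELFIES string: {selfies!r}")
    return tokens


def build_vocab(strings: Iterable[str]) -> Dict[str, int]:
    """Map every symbol seen in `strings` to an integer id (hypothetical helper)."""
    symbols = sorted({tok for s in strings for tok in tokenize_selfies(s)})
    return {sym: idx for idx, sym in enumerate(symbols)}


# SELFIES string for benzene, as given in the selfies package documentation.
benzene = "[C][=C][C][=C][C][=C][Ring1][=Branch1]"
tokens = tokenize_selfies(benzene)
vocab = build_vocab([benzene])
token_ids = [vocab[tok] for tok in tokens]
print(tokens)
print(token_ids)
```

Because every symbol is a whole bracketed unit, a decoder that emits one symbol per step never has to balance parentheses or ring-closure digits the way a character-level SMILES decoder does.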

Usage

Dependency installation

  1. Install PyTorch:
conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch
  2. Install PyTorch Geometric:
conda install pyg -c pyg
  3. Install BioPandas:
conda install biopandas -c conda-forge
  4. Install selfies:
pip install selfies
  5. Install RDKit:
conda install rdkit -c conda-forge
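
After running the installation steps above, a quick check like the following can confirm that each package is visible to the current Python environment. This is only a sketch: it tests whether the modules can be located, not whether their versions are compatible. The function name `missing_modules` is made up for illustration.

```python
from importlib.util import find_spec
from typing import List, Sequence

# Import names of the packages installed above. These are the names used
# in `import ...` statements, which differ from some of the conda/pip
# package names (for example, `pyg` is imported as `torch_geometric`).
REQUIRED_MODULES = ("torch", "torch_geometric", "biopandas", "selfies", "rdkit")


def missing_modules(modules: Sequence[str] = REQUIRED_MODULES) -> List[str]:
    """Return the modules that cannot be found in this environment."""
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All dependencies were found.")
```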

Dataset

All the related data can be downloaded here. After extraction, there will be two folders:

  1. pocket-data: files that contain information about the pockets. We will use the .mol2 files.
  2. protein-data: files that contain information about the proteins. We will use the .pops and .profile files.
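
As a sanity check after extraction, the two folders described above can be scanned for the file types the pipeline uses. The sketch below makes no assumption about how the files are nested inside each folder and simply searches recursively. The function name `count_by_suffix` is hypothetical.

```python
from pathlib import Path
from typing import Dict, Sequence


def count_by_suffix(folder, suffixes: Sequence[str]) -> Dict[str, int]:
    """Count files under `folder` (searched recursively) for each suffix."""
    counts = {suffix: 0 for suffix in suffixes}
    for path in Path(folder).rglob("*"):
        if path.is_file() and path.suffix in counts:
            counts[path.suffix] += 1
    return counts


if __name__ == "__main__":
    # Folder names as described above; run this from the directory
    # where the archive was extracted.
    expected = [("pocket-data", [".mol2"]),
                ("protein-data", [".pops", ".profile"])]
    for folder, suffixes in expected:
        if Path(folder).is_dir():
            print(folder, count_by_suffix(folder, suffixes))
        else:
            print(f"{folder}: not found in the current directory")
```

A count of zero for any suffix usually means the archive was extracted to a different location than the one being scanned.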

Train

The training configuration lives in train.yaml. Set pocket_dir to the path of pocket-data, and set pop_dir and profile_dir to the path of protein-data. Set out_dir to the folder where you want to save the output. The remaining options are hyperparameters, and their names are self-explanatory. The script train.py trains the model on a 90%-10% split of the dataset, and you can specify which fold is used for validation:

python train.py -val_fold 0
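
To illustrate what selecting a validation fold means, here is a minimal sketch of a 90%-10% split. It assumes the dataset is divided into 10 equally sized folds and that fold membership is decided by a seeded shuffle. Both are assumptions made for illustration; the way train.py actually assigns pockets to folds is not described here. The function name `split_by_fold` and the pocket names are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def split_by_fold(items: Sequence[str], val_fold: int,
                  num_folds: int = 10, seed: int = 0) -> Tuple[List[str], List[str]]:
    """Return (train, validation) lists, holding out one fold for validation."""
    if not 0 <= val_fold < num_folds:
        raise ValueError(f"val_fold must be in [0, {num_folds - 1}]")
    # Sort first so the split does not depend on the input order,
    # then shuffle with a fixed seed so it is reproducible.
    shuffled = sorted(items)
    random.Random(seed).shuffle(shuffled)
    train, val = [], []
    for index, item in enumerate(shuffled):
        (val if index % num_folds == val_fold else train).append(item)
    return train, val


# Made-up pocket names, used only to show the resulting split sizes.
pockets = [f"pocket_{i:03d}" for i in range(100)]
train, val = split_by_fold(pockets, val_fold=0)
print(len(train), len(val))  # 90 10
```

Running the split once for each value of `val_fold` from 0 to 9 uses every item for validation exactly once.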

In addition, you can use a pretrained RNN to initialize the decoder; the pretrained model can be found here. The RNN is pretrained on the ChEMBL dataset and can improve the performance of the model. I have written an example of pretraining the RNN here.

Sample molecules

After training, the model is saved in out_dir, and we can use it to sample molecules for the pockets in the validation fold:

python sample.py -batch_size 1024 -num_batches 2 -pocket_dir path_to_dataset_folder -popsa_dir path_to_pops_folder -profile_dir path_to_profile_folder -result_dir path_to_training_output_folder -fold 0

The model can also be used to sample molecules for unseen, user-defined pockets. Simply omit the -fold option, and the code will run on the specified input directories:

python sample.py -batch_size 1024 -num_batches 2 -pocket_dir path_to_dataset_folder -popsa_dir path_to_pops_folder -profile_dir path_to_profile_folder -result_dir path_to_training_output_folder
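
When sampling is driven from another script, the two invocations above differ only in whether -fold is passed. The sketch below assembles the argument list accordingly. The flag names are the ones shown in the commands above; the wrapper function `build_sample_command` and the example paths are hypothetical.

```python
import shlex
from typing import List, Optional


def build_sample_command(pocket_dir: str, popsa_dir: str, profile_dir: str,
                         result_dir: str, fold: Optional[int] = None,
                         batch_size: int = 1024, num_batches: int = 2) -> List[str]:
    """Assemble the sample.py command line; -fold is added only when given."""
    cmd = [
        "python", "sample.py",
        "-batch_size", str(batch_size),
        "-num_batches", str(num_batches),
        "-pocket_dir", pocket_dir,
        "-popsa_dir", popsa_dir,
        "-profile_dir", profile_dir,
        "-result_dir", result_dir,
    ]
    if fold is not None:
        # With -fold, sampling targets the pockets of that validation fold.
        cmd += ["-fold", str(fold)]
    return cmd


# Placeholder paths; replace them with real directories before running.
validation_cmd = build_sample_command("pocket-data", "protein-data",
                                      "protein-data", "results", fold=0)
unseen_cmd = build_sample_command("my-pockets", "my-proteins",
                                  "my-proteins", "results")
print(shlex.join(validation_cmd))
print(shlex.join(unseen_cmd))
```

Either list can be passed to `subprocess.run(cmd, check=True)` from the repository root to launch the sampling.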
