This project forked from ginnm/deprot

Code for DeProt, a protein language model with quantized structure and disentangled attention.

License: GNU General Public License v3.0

DeProt

1 Install

git clone https://github.com/ginnm/DeProt.git
cd DeProt
pip install -r requirements.txt
export PYTHONPATH=$PYTHONPATH:$(pwd)

2 Structure quantizer

from deprot.structure.quantizer import PdbQuantizer

# structure_vocab_size can be 20, 128, 512, 1024, 2048, or 4096
processor = PdbQuantizer(structure_vocab_size=2048)
# Quantize the structure in a PDB file into discrete structure tokens
result = processor("example_data/p1.pdb", return_residue_seq=False)

Output:

[407, 998, 1841, 1421, 653, 450, 117, 822, ...]

3 DeProt models have been uploaded to Hugging Face 🤗 Transformers

from transformers import AutoModelForMaskedLM, AutoTokenizer
model = AutoModelForMaskedLM.from_pretrained("AI4Protein/DeProt-2048", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("AI4Protein/DeProt-2048", trust_remote_code=True)

3.1 Available models

Model                             Description
AI4Protein/Deprot-20              structure vocab size = 20
AI4Protein/Deprot-128             structure vocab size = 128
AI4Protein/Deprot-512             structure vocab size = 512
AI4Protein/Deprot-1024            structure vocab size = 1024
AI4Protein/Deprot-2048            structure vocab size = 2048
AI4Protein/Deprot-4096            structure vocab size = 4096
AI4Protein/Deprot-2048-NO_AA2SS   Ablative
AI4Protein/Deprot-2048-NO_SS2AA   Ablative
AI4Protein/Deprot-2048-NO_AA2POS  Ablative
AI4Protein/Deprot-2048-NO_POS2AA  Ablative
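
The non-ablation checkpoints follow one naming pattern, and the structure vocabulary size used by the quantizer presumably has to match the checkpoint it feeds. The sketch below encodes that pairing. `deprot_checkpoint` is a hypothetical helper, not part of the repository, and the naming pattern is inferred only from the table.

```python
# Structure vocabulary sizes listed in the model table.
SUPPORTED_VOCAB_SIZES = (20, 128, 512, 1024, 2048, 4096)


def deprot_checkpoint(structure_vocab_size: int) -> str:
    """Return the model ID that matches a structure vocabulary size.

    Hypothetical helper: the "AI4Protein/Deprot-<size>" pattern is read
    off the table, not taken from the DeProt code base.
    """
    if structure_vocab_size not in SUPPORTED_VOCAB_SIZES:
        raise ValueError(
            f"unsupported structure vocab size {structure_vocab_size}; "
            f"expected one of {SUPPORTED_VOCAB_SIZES}"
        )
    return f"AI4Protein/Deprot-{structure_vocab_size}"


print(deprot_checkpoint(2048))  # AI4Protein/Deprot-2048
```

Using the same integer for both `PdbQuantizer(structure_vocab_size=...)` and the checkpoint name keeps the two from drifting apart.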

4 Zero-shot mutant effect prediction

4.1 Example notebook

Zero-shot mutant effect prediction
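
For orientation, masked language models are commonly used for zero-shot mutant scoring by comparing the log-probability of the mutant residue with that of the wild-type residue at each mutated position. The sketch below shows that general idea on toy logits. It is an assumption that the notebook follows the same scheme; `score_mutant`, the amino-acid column order, and the `A2G:C3D` mutant notation are illustrative choices, not the repository's API.

```python
import math

# The 20 standard amino acids. The column order of `logits` is assumed
# to follow this string; a real tokenizer defines its own order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    peak = max(logits)
    log_z = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_z for x in logits]


def score_mutant(logits, wild_type, mutant):
    """Sum log p(mutant) - log p(wild type) over all substitutions.

    `logits[i]` holds one value per entry of AMINO_ACIDS for residue i.
    `mutant` uses 1-based positions, e.g. "A2G" or "A2G:C3D".
    """
    total = 0.0
    for substitution in mutant.split(":"):
        wt_aa, mt_aa = substitution[0], substitution[-1]
        position = int(substitution[1:-1]) - 1  # convert to 0-based index
        if wild_type[position] != wt_aa:
            raise ValueError(
                f"{substitution}: wild type has {wild_type[position]} at this position"
            )
        log_probs = log_softmax(logits[position])
        total += log_probs[AMINO_ACIDS.index(mt_aa)] - log_probs[AMINO_ACIDS.index(wt_aa)]
    return total


# Toy example: uniform logits, except that position 2 favours G by one logit.
wild_type = "MAC"
logits = [[0.0] * len(AMINO_ACIDS) for _ in wild_type]
logits[1][AMINO_ACIDS.index("G")] = 1.0
print(score_mutant(logits, wild_type, "A2G"))  # close to 1.0: G is favoured over A
```

A positive score means the model considers the mutant residue more likely than the wild type; a score near zero means it has no preference.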

4.2 Run ProteinGym Benchmark

Download the dataset from Google Drive. (The archive contains the quantized structures for the proteins in ProteinGym.)

cd example_data
unzip proteingym_benchmark.zip
# return to the repository root so the relative paths below resolve
cd ..
python zero_shot/proteingym_benchmark.py --model_path AI4Protein/DeProt-2048 \
    --structure_dir example_data/structure_sequence/2048
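
ProteinGym compares predicted scores with experimental measurements mainly through Spearman's rank correlation. The sketch below is a dependency-free version of that metric for checking results by hand. It is not the benchmark script's own evaluation code, and `average_ranks` and `spearman` are illustrative names.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman(predicted, measured):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    if len(predicted) != len(measured) or len(predicted) < 2:
        raise ValueError("need two equally long sequences with at least 2 items")
    rx, ry = average_ranks(predicted), average_ranks(measured)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Scores that order the mutants exactly like the measurements give 1.0.
print(spearman([0.1, 0.5, 0.9], [10.0, 20.0, 30.0]))
```

Because only ranks matter, the metric is unaffected by the scale of the model's scores, which suits zero-shot predictions that are not calibrated to the assay.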

5 Representation


6 Transfer-Learning


Citation

If you use DeProt in your research, please cite the following paper:

@article {Li2024.04.15.589672,
	author = {Mingchen Li and Yang Tan and Bozitao Zhong and Ziyi Zhou and Huiqun Yu and Xinzhu Ma and Wanli Ouyang and Liang Hong and Bingxin Zhou and Pan Tan},
	title = {DeProt: A protein language model with quantizied structure and disentangled attention},
	elocation-id = {2024.04.15.589672},
	year = {2024},
	doi = {10.1101/2024.04.15.589672},
	publisher = {Cold Spring Harbor Laboratory},
	URL = {https://www.biorxiv.org/content/early/2024/04/17/2024.04.15.589672},
	eprint = {https://www.biorxiv.org/content/early/2024/04/17/2024.04.15.589672.full.pdf},
	journal = {bioRxiv}
}
