deepab's Introduction

DeepAb

Official repository for DeepAb: Antibody structure prediction using interpretable deep learning. The code, data, and weights for this work are made available under the Rosetta-DL license as part of the Rosetta-DL bundle.

Try antibody structure prediction in Google Colab.

Setup

Optional: Create and activate a python virtual environment

python3 -m venv venv
source venv/bin/activate

Install project dependencies

pip install -r requirements.txt

Note: PyRosetta should be installed following the instructions here.

Download pretrained model weights

wget https://data.graylab.jhu.edu/ensemble_abresnet_v1.tar.gz
tar -xf ensemble_abresnet_v1.tar.gz

After extracting the archive, the pre-trained models may need to be moved so that their paths match trained_models/ensemble_abresnet/rs*.pt
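The layout of the extracted archive is not stated here, so the following is only a minimal sketch for gathering the weight files into the expected directory. The helper name collect_weights and the default search root are hypothetical; the target path and the rs*.pt pattern come from the note above.

```python
import shutil
from pathlib import Path


def collect_weights(search_root=".", target="trained_models/ensemble_abresnet"):
    """Move any rs*.pt files found under search_root into target.

    Returns the sorted list of weight files present in target afterwards.
    """
    target_dir = Path(target)
    target_dir.mkdir(parents=True, exist_ok=True)
    # sorted() materializes the matches before any file is moved.
    for path in sorted(Path(search_root).rglob("rs*.pt")):
        # Skip files that are already in the expected location.
        if path.parent.resolve() == target_dir.resolve():
            continue
        shutil.move(str(path), str(target_dir / path.name))
    return sorted(target_dir.glob("rs*.pt"))
```

Calling collect_weights() from the repository root and checking that the returned list is non-empty is a quick way to confirm the weights are where the scripts look for them.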

Common workflows

Additional options for all scripts are available by running with --help.

Note: This project is tested with Python 3.7.9

Note: Using the --renumber option will send your antibody sequence to the AbNum server. If you are working with confidential sequences, avoid this option and use an external renumbering tool.

Structure prediction

Generate an antibody structure prediction from an Fv sequence with five decoys:

python predict.py data/sample_files/4h0h.fasta --decoys 5 --renumber

Generate a structure for a single heavy or light chain:

python predict.py data/sample_files/4h0h.fasta --decoys 5 --single_chain

Note: The fasta file should contain a single entry labeled "H" (even if the sequence is a light chain).
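A minimal sketch for writing input files with the expected labels. The helper name write_fv_fasta is hypothetical, and the ">:H" / ">:L" header format is an assumption taken from the Colab snippet quoted in the issues further down; check data/sample_files/4h0h.fasta for the exact convention.

```python
def write_fv_fasta(path, heavy, light=None):
    """Write an Fv (or single-chain) FASTA file.

    With light=None only an "H" entry is written, which matches the
    single-chain note above even when the sequence is a light chain.
    """
    records = [(":H", heavy)]
    if light is not None:
        records.append((":L", light))
    with open(path, "w") as handle:
        for label, sequence in records:
            handle.write(">{}\n{}\n".format(label, sequence))
    return path
```

For a single light chain, pass the light-chain sequence as the heavy argument and leave light unset.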

Expected output

After the script completes, the final prediction will be saved as pred.deepab.pdb. The numbered decoy structures will be stored in the decoys/ directory.

Attention annotation

Annotate an Fv structure with H3 attention:

python annotate_attention.py data/sample_files/4h0h.truncated.pdb --renumber --cdr_loop h3

Note: CDR loop residues are determined using Chothia definitions, so the input structure should be numbered beforehand or renumbered by passing --renumber

Expected output

After the script completes, the annotated PDB will overwrite the input file (unless --out_file is specified). Annotations will be stored as b-factor information, and can be visualized in PyMOL or similar software.
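If you want the annotations as numbers rather than colors, the b-factor column can be read directly from the fixed-width ATOM records. The sketch below is not part of DeepAb; the helper name residue_bfactors is hypothetical, and it assumes the CA atom carries the value for its residue.

```python
def residue_bfactors(pdb_path):
    """Return {(chain, residue number, insertion code): b-factor} from CA atoms."""
    values = {}
    with open(pdb_path) as handle:
        for line in handle:
            if not line.startswith("ATOM"):
                continue
            # Atom name occupies columns 13-16 of the PDB format.
            if line[12:16].strip() != "CA":
                continue
            # Chain is column 22, residue number 23-26, insertion code 27.
            key = (line[21], int(line[22:26]), line[26].strip())
            # The b-factor occupies columns 61-66.
            values[key] = float(line[60:66])
    return values
```

The insertion code is kept in the key because Chothia numbering uses insertion letters (for example 100A) within CDR loops.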

Design scoring

Calculate ΔCCE for a list of designed sequences:

python score_design.py data/sample_files/wt.fasta data/sample_files/h_mut_seqs.fasta data/sample_files/l_mut_seqs.fasta design_out.csv

Expected output

After the script completes, the designs and scores will be written to a CSV file with each row containing the design ID, heavy chain sequence, light chain sequence, and ΔCCE value.
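A minimal sketch for loading the resulting file and ordering the designs by score. The helper name load_design_scores is hypothetical. The column order follows the description above; whether the file starts with a header row is not stated, so rows whose last column is not numeric are skipped. Sorting in ascending order is only a convenience here, not a statement about which direction is better.

```python
import csv


def load_design_scores(csv_path):
    """Read (design ID, heavy, light, ΔCCE) rows, sorted by ascending ΔCCE."""
    designs = []
    with open(csv_path, newline="") as handle:
        for row in csv.reader(handle):
            if len(row) < 4:
                continue
            try:
                score = float(row[3])
            except ValueError:
                # Not a number: most likely a header row, so skip it.
                continue
            designs.append((row[0], row[1], row[2], score))
    return sorted(designs, key=lambda design: design[3])
```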

References

[1] JA Ruffolo, J Sulam, and JJ Gray. "Antibody structure prediction using interpretable deep learning." Patterns (2022).

deepab's People

Contributors

everyday847, jeffreyruffolo, jjgray, smoe

deepab's Issues

Error in google colab

Hi! I tried to analyse my sequence with Google Colab, but I get this error. Can you help me? Thank you

IndexError Traceback (most recent call last)
in
7 with open(fasta_file, "w") as f:
8 f.write(">:H\n{}\n>:L\n{}\n".format(heavy_sequence, light_sequence))
----> 9 cst_defs = get_cst_defs(model, fasta_file, device=device)
10
11 pred_pdb = build_structure(model,

1 frames
/content/deepab/deepab/constraints/write_constraints.py in get_constraint_residue_pairs(model, fasta_file, constraint_bin_value_dict, mask_distant_orientations, use_logits, device)
94
95 if constraint_type in pairwise_constraint_types:
---> 96 if preds[pred_i][i, j].argmax().item() >= len(
97 constraint_bin_value_dict[constraint_type]):
98 continue

IndexError: index 235 is out of bounds for dimension 0 with size 235

All decoys look the same

When I run the code

python predict.py data/sample_files/4h0h.fasta --decoys 5 --renumber

all the predicted decoys are identical, which I think isn't supposed to happen.
I get the following warning when running the code:

/home/venv/lib/python3.7/site-packages/torch/utils/checkpoint.py:25: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
warnings.warn("None of the inputs have requires_grad=True. Gradients will be None")

Is this the cause for identical decoys or is something else wrong?
I am running this on Ubuntu 18.04 with Python 3.7.9 and I also tried it on Ubuntu 20.04 with Python 3.8.10. I get the same warning and the same issue with decoys on both.

no decoys created

The message "Creating decoys structures" stays on screen indefinitely and no decoy is created; nothing happens after this message. Do you have any idea what might be going on? Thank you!

Training

Can I train the model instead of using the pretrained one?

Error when running test data

python predict.py data/sample_files/4h0h.fasta --decoys 5 --renumber
**************************************************
Generating constraints
**************************************************
/opt/anaconda3/envs/trimmer_lab/lib/python3.8/site-packages/torch/utils/checkpoint.py:25: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
  warnings.warn("None of the inputs have requires_grad=True. Gradients will be None")
100%|███████████████████████████████████████████████████████████████████████████| 230/230 [00:17<00:00, 13.17it/s]
**************************************************
Creating MDS structure
**************************************************
PyRosetta-4 2020 [Rosetta PyRosetta4.conda.mac.cxx11thread.serialization.python38.Release 2020.44+release.9f64dcffe297ee46f5259e47c90eadbf3b40d143 2020-10-27T11:42:54] retrieved from: http://www.pyrosetta.org
(C) Copyright Rosetta Commons Member Institutions. Created in JHU by Sergey Lyskov and PyRosetta Team.
**************************************************
Creating decoys structures
**************************************************
  0%|                                                                                       | 0/5 [00:00<?, ?it/s]PyRosetta-4 2020 [Rosetta PyRosetta4.conda.mac.cxx11thread.serialization.python38.Release 2020.44+release.9f64dcffe297ee46f5259e47c90eadbf3b40d143 2020-10-27T11:42:54] retrieved from: http://www.pyrosetta.org
(C) Copyright Rosetta Commons Member Institutions. Created in JHU by Sergey Lyskov and PyRosetta Team.
PyRosetta-4 2020 [Rosetta PyRosetta4.conda.mac.cxx11thread.serialization.python38.Release 2020.44+release.9f64dcffe297ee46f5259e47c90eadbf3b40d143 2020-10-27T11:42:54] retrieved from: http://www.pyrosetta.org
(C) Copyright Rosetta Commons Member Institutions. Created in JHU by Sergey Lyskov and PyRosetta Team.
PyRosetta-4 2020 [Rosetta PyRosetta4.conda.mac.cxx11thread.serialization.python38.Release 2020.44+release.9f64dcffe297ee46f5259e47c90eadbf3b40d143 2020-10-27T11:42:54] retrieved from: http://www.pyrosetta.org
(C) Copyright Rosetta Commons Member Institutions. Created in JHU by Sergey Lyskov and PyRosetta Team.
PyRosetta-4 2020 [Rosetta PyRosetta4.conda.mac.cxx11thread.serialization.python38.Release 2020.44+release.9f64dcffe297ee46f5259e47c90eadbf3b40d143 2020-10-27T11:42:54] retrieved from: http://www.pyrosetta.org
(C) Copyright Rosetta Commons Member Institutions. Created in JHU by Sergey Lyskov and PyRosetta Team.
PyRosetta-4 2020 [Rosetta PyRosetta4.conda.mac.cxx11thread.serialization.python38.Release 2020.44+release.9f64dcffe297ee46f5259e47c90eadbf3b40d143 2020-10-27T11:42:54] retrieved from: http://www.pyrosetta.org
(C) Copyright Rosetta Commons Member Institutions. Created in JHU by Sergey Lyskov and PyRosetta Team.

ERROR: The x and/or y axis lines in the given spline are empty
x_axis: []
y_axis: []

ERROR:: Exit from: /Volumes/MacintoshHD3/benchmark/W.fujii.release/rosetta.Fujii.release/_commits_/main/source/src/numeric/interpolation/util.cc line: 51

issue when running the demo

After installing, I ran the demo sequence and got the following error:

Generating constraints


/xxx/bin/python/miniconda3/envs/deepab/lib/python3.7/site-packages/torch/utils/checkpoint.py:25: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
warnings.warn("None of the inputs have requires_grad=True. Gradients will be None")


Creating MDS structure


ERROR: The x and/or y axis lines in the given spline are empty
x_axis: []
y_axis: []

ERROR:: Exit from: /scratch/benchmark/W.hojo-2/rosetta.Hojo-2/commits/main/source/src/numeric/interpolation/util.cc line: 51

concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
File "/xxx/bin/python/miniconda3/envs/deepab/lib/python3.7/concurrent/futures/process.py", line 239, in _process_worker
r = call_item.fn(*call_item.args, **call_item.kwargs)
File "/xxx/bin/python/miniconda3/envs/deepab/lib/python3.7/concurrent/futures/process.py", line 198, in process_chunk
return [fn(*args) for args in chunk]
File "/xxx/bin/python/miniconda3/envs/deepab/lib/python3.7/concurrent/futures/process.py", line 198, in
return [fn(*args) for args in chunk]
File "predict.py", line 29, in refine_fv

return refine_fv(in_pdb_file, out_pdb_file, cst_defs)
File "/xxx/ProteinFolding/DeepAb/deepab/build_fv/build_cen_fa.py", line 207, in refine_fv
pose)
RuntimeError:

File: /scratch/benchmark/W.hojo-2/rosetta.Hojo-2/commits/main/source/src/numeric/interpolation/util.cc:51
[ ERROR ] UtilityExitException
ERROR: The x and/or y axis lines in the given spline are empty
x_axis: []
y_axis: []

Could you help check what the issue is?

Can't install on CentOS

With python=3.7 (as in the DeepAb README),
pip install -r requirements.txt
resulted in
[screenshot: Screen Shot 2022-02-26 at 7 56 57 PM]

Even with python=3.8,
pip install -r requirements.txt
gave the same result.

Running
conda install -c pytorch pytorch=1.7.1
before 'pip install -r requirements.txt'
didn't help.

Technical problems

Hi,
I am new to antibody structure prediction. How should I extract the exact Fv sequence from a FASTA file of the full-length antibody?

Thanks

Reference vs Expected path?

I downloaded the tar wheel, ran sudo python setup.py install, and got the following:


>>> import pyrosetta
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/anaconda3/envs/trimmer_lab/lib/python3.8/site-packages/pyrosetta/__init__.py", line 15, in <module>
    import pyrosetta.rosetta as rosetta
ImportError: dlopen(/opt/anaconda3/envs/trimmer_lab/lib/python3.8/site-packages/pyrosetta/rosetta.so, 2): Symbol not found: __ZNKSt3__115basic_stringbufIcNS_11char_traitsIcEENS_9allocatorIcEEE3strEv
  Referenced from: /opt/anaconda3/envs/trimmer_lab/lib/python3.8/site-packages/pyrosetta/rosetta.so (which was built for Mac OS X 12.0)
  Expected in: /usr/lib/libc++.1.dylib


error while trying to predict the structure

I have an antibody sequence with a GGGGSGGGGSGGGGS linker. I divided the sequence into heavy and light chains and tried to predict the structure, but the "Annotate predicted structure with output attention" step produced this error:

PDB must have a chain with chain id "[PBD ID]:H"
/usr/local/lib/python3.7/dist-packages/Bio/PDB/PDBParser.py:399: PDBConstructionWarning: Ignoring unrecognized record 'pdbpat' at line 1
PDBConstructionWarning,
An exception has occurred, use %tb to see the full traceback.

SystemExit: -1

Also, the generated .deepab.pdb file does not contain any atoms (it has pdbpatchnumbering: Unable to read patch file).

I am using the Colab version.

Internet requirement problem

I am a student working on BCR sequencing, and I am trying to run DeepAb on our HPC cluster to predict the structures of some BCRs. However, I get errors such as the one below. After re-checking the environment, I think the problem is a network requirement: the requirements file includes packages such as beautifulsoup and requests, which appear to fetch information from an online website. Our HPC is hosted in a hospital and only the login node has network access; when I submit jobs to the compute nodes to run multiple sequences at the same time, no network is available on those nodes. Is there a solution for a network-free environment?
Thanks

new_pdb_data = response.text
UnboundLocalError: local variable 'response' referenced before assignment

AbResNet loss value

Respected sir, what loss value did you obtain while training AbResNet, and what batch size did you use?

Getting 'IndexError: list index out of range' when running structure prediction

While trying to run the structure prediction step in Google Colab I get,

Traceback (most recent call last):
  File "/content/DeepAb/deepab/models/ModelEnsemble.py", line 21, in __init__
    self._num_out_bins = self.models[0]._num_out_bins
IndexError: list index out of range

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "predict.py", line 190, in <module>
    _cli()
  File "predict.py", line 156, in _cli
    device=device)
  File "/content/DeepAb/deepab/models/ModelEnsemble.py", line 23, in __init__
    self._num_out_bins = self.models[0].num_out_bins
IndexError: list index out of range

What is going wrong?

Multiple mAb predictions in Google Colab

Hi!
I'm wondering if there's a way to predict ~100 mAb structures using the Google Colab for DeepAb? Without having to load them one by one and wait. Is it possible to have something like a batch input or a for-loop?

Kind regards,
Matthias
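One possible approach, sketched for a local checkout rather than the Colab notebook itself, is a plain loop over FASTA files. This is not an official batch mode: the helper name predict_batch is hypothetical, and the sketch assumes predict.py is in the current directory and writes pred.deepab.pdb there, as described under "Expected output". Only the --decoys option from the README is used.

```python
import shutil
import subprocess
import sys
from pathlib import Path


def predict_batch(fasta_dir, out_dir, decoys=5, run=subprocess.run):
    """Run predict.py once per FASTA file and keep each prediction.

    Assumes predict.py is in the current directory and writes
    pred.deepab.pdb there; adjust both if your setup differs.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for fasta in sorted(Path(fasta_dir).glob("*.fasta")):
        command = [sys.executable, "predict.py", str(fasta),
                   "--decoys", str(decoys)]
        run(command, check=True)
        prediction = Path("pred.deepab.pdb")
        if prediction.exists():
            # Rename so the next run does not overwrite this result.
            target = out_dir / (fasta.stem + ".deepab.pdb")
            shutil.move(str(prediction), str(target))
            saved.append(target)
    return saved
```

The run parameter exists only so the loop can be exercised without launching real predictions; by default it is subprocess.run, and a failing prediction raises CalledProcessError and stops the batch.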

please, add --use_gpu recommendation to readme

When I first ran the program I got a runtime error. After I added the --use_gpu flag, it went away. It may be worth mentioning this in the docs.

python predict.py data/sample_files/4h0h.fasta --decoys 5 --renumber
PyRosetta-4 2021 [Rosetta PyRosetta4.MinSizeRel.python38.ubuntu 2021.33+release.21c4761a87a1193dca5c6c2e1047681a200715d4 2021-08-14T17:47:22] retrieved from: http://www.pyrosetta.org
(C) Copyright Rosetta Commons Member Institutions. Created in JHU by Sergey Lyskov and PyRosetta Team.
**************************************************
Generating constraints
**************************************************

Traceback (most recent call last):
  File "predict.py", line 190, in <module>
    _cli()
  File "predict.py", line 166, in _cli
    cst_file = get_cst_file(model,
  File "/data/sources/DeepAb/deepab/build_fv/build_cen_fa.py", line 68, in get_cst_file
    residue_pairs = get_constraint_residue_pairs(model,
  File "/data/sources/DeepAb/deepab/constraints/write_constraints.py", line 74, in get_constraint_residue_pairs
    logits = get_logits_from_model(model, fasta_file, device=device)
  File "/data/sources/DeepAb/deepab/util/model_out.py", line 56, in get_logits_from_model
    out = model(seq)
  File "/opt/micromamba/envs/DeepAb/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/data/sources/DeepAb/deepab/models/ModelEnsemble.py", line 32, in forward
    out = [model(x) for model in self.models]
  File "/data/sources/DeepAb/deepab/models/ModelEnsemble.py", line 32, in <listcomp>
    out = [model(x) for model in self.models]
  File "/opt/micromamba/envs/DeepAb/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/data/sources/DeepAb/deepab/models/AbResNet/AbResNet.py", line 188, in forward
    lstm_enc = self.get_lstm_encoding(x)
  File "/data/sources/DeepAb/deepab/models/AbResNet/AbResNet.py", line 141, in get_lstm_encoding
    enc = self.lstm_model.encoder(src=lstm_input)[0].detach()
  File "/opt/micromamba/envs/DeepAb/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/data/sources/DeepAb/deepab/models/PairedSeqLSTM/PairedSeqLSTM.py", line 25, in forward
    outputs, (hidden, cell) = self.rnn(src.float())
  File "/opt/micromamba/envs/DeepAb/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/micromamba/envs/DeepAb/lib/python3.8/site-packages/torch/nn/modules/rnn.py", line 581, in forward
    result = _VF.lstm(input, hx, self._flat_weights, self.bias, self.num_layers,
RuntimeError: Input and parameter tensors are not at the same device, found input tensor at cuda:0 and parameter tensor at cpu
