
jvkersch / tmtools


Python bindings for the TM-align algorithm and code for protein structure comparison developed by Zhang et al.

License: GNU General Public License v3.0

Python 49.14% C++ 48.08% C 2.78%

tmtools's Introduction

TM-Tools

Python bindings for the TM-align algorithm and code developed by Zhang et al. for protein structure comparison.

Installation

You can install the released version of the package directly from PyPI by running

    pip install tmtools

Pre-built wheels are available for Linux, macOS, and Windows, for Python 3.6 and up.

To build the package from scratch, e.g. because you want to contribute to it, clone this repository, and then from the root of the repository, run

    pip install -e . -v

This requires a C++ compiler with support for C++14.

Usage

The function tmtools.tm_align takes two NumPy arrays of residue coordinates (each with shape (N, 3), where N is the number of residues in that chain) and the two corresponding sequences of peptide codes, performs the alignment, and returns the optimal rotation matrix and translation, along with the TM-score normalized by each chain:

>>> import numpy as np
>>> from tmtools import tm_align
>>>
>>> coords1 = np.array(
...     [[1.2, 3.4, 1.5],
...      [4.0, 2.8, 3.7],
...      [1.2, 4.2, 4.3],
...      [0.0, 1.0, 2.0]])
>>> coords2 = np.array(
...     [[2.3, 7.4, 1.5],
...      [4.0, 2.9, -1.7],
...      [1.2, 4.2, 4.3]])
>>>
>>> seq1 = "AYLP"
>>> seq2 = "ARN"
>>>
>>> res = tm_align(coords1, coords2, seq1, seq2)
>>> res.t
array([ 2.94676159,  5.55265245, -1.75151383])
>>> res.u
array([[ 0.40393231,  0.04161396, -0.91384187],
       [-0.59535733,  0.77040999, -0.22807475],
       [ 0.69454181,  0.63618922,  0.33596866]])
>>> res.tm_norm_chain1
0.3105833326322145
>>> res.tm_norm_chain2
0.414111110176286
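
To superpose one structure onto the other, the returned rotation and translation can be applied to a coordinate array. The snippet below is a minimal sketch and not part of tmtools: apply_transform is a hypothetical helper, and it assumes the upstream TM-align convention in which a point x of the first structure is mapped to u @ x + t. It is worth checking the direction on your own data before relying on it.

```python
import numpy as np


def apply_transform(coords, u, t):
    """Map every row x of an (N, 3) array to u @ x + t.

    Hypothetical helper, not part of tmtools. Assumes u is a 3x3 rotation
    matrix and t a length-3 translation, like res.u and res.t above.
    """
    coords = np.asarray(coords, dtype=float)
    u = np.asarray(u, dtype=float)
    t = np.asarray(t, dtype=float)
    # Row-vector form of u @ x + t, applied to all rows at once.
    return coords @ u.T + t


# Toy check: a 90-degree rotation about the z axis plus a unit shift along x.
u = np.array([[0.0, -1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])
t = np.array([1.0, 0.0, 0.0])
moved = apply_transform([[1.0, 0.0, 0.0]], u, t)
```

Under the same assumption, the result of the example above would be used as apply_transform(coords1, res.u, res.t).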

If you already have some PDB files, you can use the functions from tmtools.io to retrieve the coordinate and sequence data. These functions rely on BioPython, which is not installed by default to keep dependencies lightweight. To use them, you have to install BioPython first (pip install biopython). Then run:

>>> from tmtools.io import get_structure, get_residue_data
>>> from tmtools.testing import get_pdb_path
>>> s = get_structure(get_pdb_path("2gtl"))
>>> s
<Structure id=2gtl>
>>> chain = next(s.get_chains())
>>> coords, seq = get_residue_data(chain)
>>> seq
'DCCSYEDRREIRHIWDDVWSSSFTDRRVAIVRAVFDDLFKHYPTSKALFERVKIDEPESGEFKSHLVRVANGLKLLINLLDDTLVLQSHLGHLADQHIQRKGVTKEYFRGIGEAFARVLPQVLSCFNVDAWNRCFHRLVARIAKDLP'
>>> coords.shape
(147, 3)
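
Before passing data to tm_align, it can help to confirm that each coordinate array has shape (N, 3) and that its sequence has one letter per row. The sketch below is illustrative only: check_residue_data is a hypothetical helper, not a tmtools function, and the one-row-per-residue requirement is an assumption based on the shapes shown above.

```python
import numpy as np


def check_residue_data(coords, seq):
    """Return coords as a float array, or raise ValueError on a mismatch.

    Hypothetical helper, not part of tmtools. Assumes one coordinate row
    per residue letter in seq.
    """
    coords = np.asarray(coords, dtype=float)
    if coords.ndim != 2 or coords.shape[1] != 3:
        raise ValueError(f"expected shape (N, 3), got {coords.shape}")
    if len(seq) != coords.shape[0]:
        raise ValueError(
            f"{coords.shape[0]} coordinate rows but {len(seq)} residues"
        )
    return coords


# Four residues with four coordinate rows pass the check.
checked = check_residue_data(np.zeros((4, 3)), "AYLP")
```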

Credits

This package arose out of a personal desire to better understand both the TM-score algorithm and the pybind11 library to interface with C++ code. At this point in time it contains no original research code.

If you use the package for research, you should cite the original TM-score papers:

  • Y. Zhang, J. Skolnick, Scoring function for automated assessment of protein structure template quality, Proteins, 57: 702-710 (2004).
  • J. Xu, Y. Zhang, How significant is a protein structure similarity with TM-score=0.5? Bioinformatics, 26, 889-895 (2010).

License

The original TM-align software (version 20210224, released under the MIT license) is bundled with this repository (src/extern/TMalign.cpp). Some small tweaks had to be made to compile the code on macOS and to embed it as a library. These modifications are also released under the MIT license.

The rest of the codebase is released under the GPL v3 license.

tmtools's People

Contributors

jvkersch, mbestipa


tmtools's Issues

Update PyPI page

  • Update project README
  • Upload source distribution
  • Upload new wheels

Fix Windows build

The original TMalign.cpp code does not compile on Windows, but it does not seem difficult to make it do so.

US-Align Support?

Hello. Thank you for making this work open-source. I've definitely found it helpful. Out of curiosity, how much effort, in your experience, do you think it would take to adapt pybind here to wrap US-align instead of TM-align? I believe doing so may provide many new ways for users to use this package.

RMSD not returned by the tm_align function

It would be great if, along with the rotation matrix, translation vector, and TM-score, the RMSD were also an output of the tm_align function.

Issues with installing package with newer version of python

Hi there.
Thank you for this useful package! I see that 3.10 wheels have been built, however I cannot install them using pip:

pyenv local 3.10.8
python -m venv venv
source venv/bin/activate
pip install --upgrade pip
pip install tmtools

Gives the following error:

ERROR: Could not find a version that satisfies the requirement tmtools (from versions: none)
ERROR: No matching distribution found for tmtools

Would it be an idea to set up Dependabot or a GitHub Action to automatically build the library for different Python versions?

Which one is the target protein, and what is the TM-score?

Hi, thanks for providing this useful toolkit.

However, I am confused about a few things and would appreciate your help. In your tutorial, tm_align accepts two inputs. Which one is the target and which is the template (coord1 vs. coord2)? Also, the output includes tm_norm_chain1 and tm_norm_chain2. Are both of them TM-scores?

I am new to TM-score, so the questions above may sound naive.

Binding for residue-residue structural alignment

First of all, thanks for creating this package.

I was looking for a way to extract the "residue-residue" alignment that TM-align generates before it finds the optimal alignment for all-to-all residue comparisons across the entirety of the two input structures.

Basically what I would like to do is calculate the TM score for two structures, given that they are superimposed at a particular position (i.e. residue position x from structure 1 is overlayed to residue position y in structure 2).

The rationale behind this is to "impose" a specific known site that is analogous between two protein structures and get the alignment score (which would be optimised by rotating the structures while maintaining their translation such that residues x and y are fixed in the same position).

i.e. given that two structures are overlayed at structure_1, residue_x and structure_2, residue_y, what is the structure-structure TM score?

As far as I am aware, this sort of information is generated in the TM-align algorithm (the dynamic programming component) before the "best" residue-residue mapping is picked to optimise the score. Do you know if it would be easy to create a python binding to return this level of information, given two structures and a specific residue from each?

Or any other methods to get this "fixed residue pair" score?

Thanks in advance.

Test on Python 3.11

An earlier test run (https://github.com/jvkersch/tmtools/actions/runs/4802598119/jobs/8546253449?pr=11) showed some compilation errors on Python 3.11, as well as warnings regarding packaging:

 ********************************************************************************
          ############################
          # Package would be ignored #
          ############################
          Python recognizes 'tmtools.data' as an importable package[^1],
          but it is absent from setuptools' `packages` configuration.

          This leads to an ambiguous overall configuration. If you want to distribute this
          package, please make sure that 'tmtools.data' is explicitly added
          to the `packages` configuration field.

          Alternatively, you can also rely on setuptools' discovery methods
          (for example by using `find_namespace_packages(...)`/`find_namespace:`
          instead of `find_packages(...)`/`find:`).

          You can read more about "package discovery" on setuptools documentation page:

          - https://setuptools.pypa.io/en/latest/userguide/package_discovery.html

          If you don't want 'tmtools.data' to be distributed and are
          already explicitly excluding 'tmtools.data' via
          `find_namespace_packages(...)/find_namespace` or `find_packages(...)/find`,
          you can try to use `exclude_package_data`, or `include-package-data=False` in
          combination with a more fine grained `package-data` configuration.

          You can read more about "package data files" on setuptools documentation page:

          - https://setuptools.pypa.io/en/latest/userguide/datafiles.html


          [^1]: For Python, any directory (with suitable naming) can be imported,
                even if it does not contain any `.py` files.
                On the other hand, currently there is no concept of package data
                directory, all directories are treated like packages.
          ********************************************************************************
