Coder Social home page Coder Social logo

mqaprank's Introduction

MQAPRank

Protein Model Quality Assessment Program by Learning-to-Rank.

Overview

MQAPRank is a global protein model quality assessment program based on learning-to-rank. It firstly sorts the decoy models by using single method based on learning-to-rank algorithm to indicate their relative qualities for the target protein firstly. And then it takes the first five models as references to predict the qualities of other models by using average GDT_TS scores between reference models and other models. The MQAPRank has participated in the CASP12 under the group name FDUBio and achieved the state-of-the-art performances.

Requirements

The implementation was developed on CentOS 6.5(64bit), and therefore would execute properly on other Linux systems.

Platform:

  • Linux 64bit (CentOS 6.5 (64bit) is recommended)

Software (the following softwares (and their updated versions) are recommanded):

  • Linux Package
    • gcc-4.4.7-16.el6.x86_64
    • gcc-c++-4.4.7-16.el6.x86_64
    • kernel-devel-2.6.32-573.12.1.el6.x86_64
    • blas-devel-3.2.1-4.el6.x86_64
    • lapack-devel-3.2.1-4.el6.x86_64
    • glibc-2.12-1.166.el6_7.3.i686
  • Perl 5
  • Python 2.7.6
    • numpy 1.8.2
    • scipy 0.14.1
  • Modeller 9.14

Disk Space:

6.2G

Download and Install

The program is available at the following location:

https://gitlab.com/AndersJing/mqaprank.git

If the specific softwares in the 'Requirements' section have been installed (there is a common way of installation in the next section.), unpack the archive using the shell command:

$ tar -zxvf MQAPRank.tar.gz

Change directory to the resulting folder:

$ cd MQAPRank/

Run the provided installation script (requiring root privileges):

$ sudo ./install.py

This will install the required files to the specified location and set the related parameters.

A common way of installation

If you want to install the MQAPRank on a CentOS 6.5(64bit) system, follow the steps below:

  • Install gcc, gcc-c++, kernel-devel, blas-devel, lapack-devel and glibc.i686:

    $ yum install gcc gcc-c++ kernel-devel blas-devel lapack-devel glibc.i686

  • Install Python 2.7.6 (Download) :

    $ cd Python-2.7.6/

    $ ./configure

    $ make all

    $ make install

    $ make clean

    $ make distclean

    $ mv /usr/bin/python /usr/bin/python2.6.6

    $ ln -s /usr/local/bin/python2.7 /usr/bin/python

The yum is not compatible with Python 2.7, so we need to specify the Python version of the yum to Python 2.6 by changing the header of file '/usr/bin/yum' from '!/usr/bin/python' to '!/usr/bin/python2.6.6'.

  • Install numpy 1.8.2 (Download) :

    $ cd numpy-1.8.2/

    $ python setup.py install

  • Install scipy 0.14.1 (Download) :

    $ cd scipy-0.14.1/

    $ python setup.py install

  • Install modeller 9.14 (XXXXXXXXXXX is the Modeller license key) (Download) :

    $ env KEY_MODELLER=XXXXXXXXXXX rpm -Uvh modeller-9.14-1.x86_64.rpm

  • Copy related files to the Python 2.7.6 directory:

    $ cp -r /usr/lib64/python2.7/site-packages/* /usr/local/lib/python2.7/

  • Install MQAPRank:

    $cd MQAPRank/ $./install.py

How to Use

MQAPRank consists two protein model quality assessment strategies, which are the single method and the quasi-single method, and MQAPRank will output the predicted values of both methods. Also, it could assess models using two structure similarity metrics: 'GDTTS' or 'TMscore'.

The typical command to call MQAPRank is:

./MQAPRank.py modelList_input result_output GDTTS
  • MQAPRank.py: the call script for the MQAPRank.
  • modelList_input: the input file for the MQAPRank, containing the full path of the protein models to be evaluated (one line contains one model).
  • result_output: the output file for the MQAPRank, containing three items in every line: full path of the model, the predicted value of the corresponding model by the single method (MQAPRank) and the predicted value of the corresponding model by the quasi single method (Quasi-MQAPRank).
  • GDTTS: the structure similarity metric for the model, which could be 'GDTTS' or 'TMscore'.

Getting started: a test Example

You will find an example problem at the test/ folder.

The test/ folder contains a subdirectory models/ and a textfile modelList. There are 10 protein models to be evaluated in the subdirectory models/ and the corresponding full path of these protein models are included in the textfile modelList ($$PATH is the installation path of the MQAPRank):

1 $$PATH/MQAPRank/test/models/model_1.pdb
2 $$PATH/MQAPRank/test/models/model_2.pdb
3 $$PATH/MQAPRank/test/models/model_3.pdb
4 $$PATH/MQAPRank/test/models/model_4.pdb
5 $$PATH/MQAPRank/test/models/model_5.pdb
6 $$PATH/MQAPRank/test/models/model_6.pdb
7 $$PATH/MQAPRank/test/models/model_7.pdb
8 $$PATH/MQAPRank/test/models/model_8.pdb
9 $$PATH/MQAPRank/test/models/model_9.pdb
10 $$PATH/MQAPRank/test/models/model_10.pdb

To run the example, execute the command:

./MQAPRank.py test/modelList test/result GDTTS

The output in the result file would be as follows:

  1 Model Name  MQAPRank score  quasi-MQAPRank score
  2 $$PATH/MQAPRank/test/models/model_4.pdb    0.66    87.79
  3 $$PATH/MQAPRank/test/models/model_1.pdb    0.65    85.87
  4 $$PATH/MQAPRank/test/models/model_3.pdb    0.65    85.57
  5 $$PATH/MQAPRank/test/models/model_5.pdb    0.64    84.35
  6 $$PATH/MQAPRank/test/models/model_6.pdb    0.56    81.93
  7 $$PATH/MQAPRank/test/models/model_10.pdb   0.55    66.72
  8 $$PATH/MQAPRank/test/models/model_2.pdb    0.51    56.57
  9 $$PATH/MQAPRank/test/models/model_9.pdb    0.49    56.21
 10 $$PATH/MQAPRank/test/models/model_7.pdb    0.38    38.69
 11 $$PATH/MQAPRank/test/models/model_8.pdb    0.25    28.23

The first column contains the full paths of protein models, the second column contains the predicted values of the corresponding models by the single method (MQAPRank) and the third column contains the predicted values of the corresponding models by the quasi single method (Quasi-MQAPRank).

Disclaimer

This software is free only for non-commercial use. It must not be distributed without prior permission of the author. The author is not responsible for implications from the use of this software.

Known Problems

none

Reference

If you use MQAPRank in your scientific work, please cite as:

  • Xiaoyang Jing, Kai Wang, Ruqian Lu and Qiwen Dong. Sorting protein decoys by machine-learning-to-rank[J]. Scientific Reports, 2016, 6.
  • Xiaoyang Jing, Qiwen Dong. MQAPRank: improved global protein model quality assessment by learning-to-rank[J]. BMC bioinformatics, 2017, 18(1): 275.

Contact

xyjing14(at)fudan.edu.cn

mqaprank's People

Contributors

xyjing-works avatar andersjing avatar

Watchers

James Cloos avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.