This project forked from elofssonlab/pcons-fold

Pipeline for protein folding using PconsC and Rosetta

License: MIT License

PconsFold

A pipeline for protein folding using predicted contacts from PconsC and a Rosetta folding protocol.

Supplementary data (protein IDs, sequences, native and predicted structures, and predicted contacts) are available at the bottom of the release page.

Pipeline overview:

  1. Input: fasta file containing one protein sequence
  2. Prepare input for PconsC
  3. Contact prediction with PconsC
  4. Prepare input for Rosetta folding
  5. Rosetta folding
  6. Extract and relax structures with lowest Rosetta energy
  7. Output: the predicted contact map (also as a plot) and the top-ranked structural model(s) relaxed and non-relaxed
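Step 6 amounts to ranking the decoys by Rosetta energy and keeping the lowest-scoring ones. A minimal sketch in Python, assuming the energies are already available as (decoy name, energy) pairs. The function name, the input format and the example values are hypothetical and are not taken from extract.py:

```python
import heapq


def select_top_models(scores, n_models=10):
    """Return the n_models decoys with the lowest energy, best first.

    scores: iterable of (decoy_name, energy) pairs (hypothetical input format).
    """
    return heapq.nsmallest(n_models, scores, key=lambda pair: pair[1])


# Made-up energies, for illustration only.
example = [("S_0001", -120.5), ("S_0002", -98.1), ("S_0003", -133.0)]
top = select_top_models(example, n_models=2)
print(top)
```

Lower Rosetta energy is better, so the sketch sorts in ascending order and truncates to n_models.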

Dependencies:

MATLAB is needed to run plmDCA. If MATLAB is not available, you can use a compiled version of plmDCA instead. To run the compiled version, you need to provide a path to the MATLAB Compiler Runtime (MCR).

How to run it:

Make sure all dependencies are working correctly and adjust the paths in localconfig.py.

To run the full pipeline use:

./pcons_fold.py [-c n_cores] [-n n_decoys] [-m n_models]
                [-f factor] [--norelax] [--nohoms] 
                hhblits_database jackhmmer_database sequence_file
  • Required:
    • hhblits_database and jackhmmer_database are paths to the databases used by HHblits and Jackhmmer
    • sequence_file is the path to the input protein sequence in FASTA format (only single sequences).
  • Optional:
    • n_cores specifies the number of cores to use during computation (default: number of available cores).
    • n_decoys specifies the number of decoy structures generated by Rosetta (default: 2000, see publication).
    • n_models is the number of top-ranked models that are extracted and, unless --norelax is given, relaxed at the end (default: 10).
    • factor determines the number of constraints used to fold the protein, which is: factor * length_of_the_input_sequence (default: 1.0).
    • norelax is a flag that suppresses relaxation of the final models. Use it to extract structures quickly at the end.
    • nohoms is a flag that excludes homologous structures from fragment picking. This is only useful in test cases where the model quality is evaluated against a known structure.
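Two of the points above lend themselves to a quick check before starting a run: the input must contain exactly one sequence, and the number of constraints is factor * length_of_the_input_sequence. The following Python sketch illustrates both. The helper names and the example sequence are hypothetical, and rounding the product down to an integer is an assumption of this sketch, not something stated for pcons_fold.py:

```python
def read_single_fasta(text):
    """Parse FASTA text and return the sequence.

    Raises ValueError unless the text contains exactly one record.
    """
    headers = 0
    residues = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            headers += 1
        else:
            residues.append(line)
    if headers != 1:
        raise ValueError("expected exactly one FASTA record, found %d" % headers)
    return "".join(residues)


def n_constraints(sequence, factor=1.0):
    """Number of constraints: factor * sequence length.

    Truncating to an integer is an assumption of this sketch.
    """
    return int(factor * len(sequence))


# Made-up 20-residue sequence, for illustration only.
fasta = ">example\nMKTAYIAKQR\nQISFVKSHFS\n"
seq = read_single_fasta(fasta)
print(len(seq), n_constraints(seq, factor=1.5))
```

With the default factor of 1.0, the number of constraints equals the sequence length.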

You can also run PconsC contact prediction independently with this command:

./pconsc/predict_all.py [-c cores] hhblits_database jackhmmer_database sequence_file

And then fold the protein according to given predicted contacts with the following commands:

./folding/rosetta/prepare_input.py [-f factor] [--nohoms] sequence_file contact_map 

./folding/rosetta/fold.py [-c n_cores] [-n n_decoys] sequence_file rosetta_constraintfile

./folding/rosetta/extract.py [-c n_cores] [-m n_models] [--norelax] number_of_extracted_structures

The first script generates the file (pconsc_output)-(factor).constraints, which is then passed to fold.py in the next step as rosetta_constraintfile.
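To chain the three folding steps from Python, the command lines can be assembled as argument lists and handed to subprocess.run. The sketch below only builds and prints the commands. It uses the script paths and options listed above, while the function name and all file names in the example call are placeholders. In particular, the constraint file name must be replaced with the one that prepare_input.py actually writes:

```python
import shlex


def build_commands(sequence_file, contact_map, constraint_file, n_extract,
                   factor=1.0, n_cores=None, n_decoys=2000, n_models=10,
                   nohoms=False, norelax=False):
    """Return argument lists for prepare_input.py, fold.py and extract.py, in order."""
    # -c is only passed when a core count is given explicitly.
    cores = ["-c", str(n_cores)] if n_cores is not None else []

    prepare = ["./folding/rosetta/prepare_input.py", "-f", str(factor)]
    if nohoms:
        prepare.append("--nohoms")
    prepare += [sequence_file, contact_map]

    fold = (["./folding/rosetta/fold.py"] + cores
            + ["-n", str(n_decoys), sequence_file, constraint_file])

    extract = ["./folding/rosetta/extract.py"] + cores + ["-m", str(n_models)]
    if norelax:
        extract.append("--norelax")
    extract.append(str(n_extract))

    return [prepare, fold, extract]


# Placeholder file names, for illustration only.
commands = build_commands("query.fasta", "query.contacts",
                          "query.constraints", 10, n_cores=4, norelax=True)
for cmd in commands:
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Each list can then be executed in order with subprocess.run(cmd, check=True), which stops the chain as soon as one step fails.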

Citation

M Michel, S Hayat, MJ Skwark, C Sander, DS Marks and A Elofsson. PconsFold: Improved contact predictions improve protein models. Bioinformatics (2014). 30(17): i482-i488
