Coder Social home page Coder Social logo

rxnpredict's Introduction

Instructions for Using rxnpredict

  1. Download and install the following programs:
  2. Add Anaconda as a PATH variable so that Python will execute the scripts within Sublime Text 3.
    • Navigate to “This PC” in File Explorer
    • Right click "This PC" → Properties → Advanced system settings → Environment Variables...
    • In User variables, click "Path" variable → Edit → New → type "path\to\Anaconda3" (no quotes – e.g., C:\Users\Derek\Anaconda3)
  3. Go to https://github.com/doylelab/rxnpredict. On right, click "Clone or Download" and then "Download Zip". This will download a local copy of the repository (folder) to your computer.
  4. All of the molecules whose properties will be used for modeling must first be drawn in Spartan:
    • Use the Spartan GUI to draw the molecules, saving them in the spartan_molecules folder (within the rxnpredict folder).
    • Be sure to label any shared atoms within a substrate class (ligand, base, etc.) with a "*". You can do so by right clicking an atom, then click "Properties". Change the label text at the bottom of the dialog box (e.g., *C1).
    • Save the molecules in both .spardir format (for future editing) and .spinput format (this is what the program uses).
  5. Modify the python scripts:
    • In setup.py, change the value of spartan_path (line 16 only) to the path of the Spartan14v114.exe file. Be sure to use \\ between folder names.
    • In main.py, describe the 2D layout of the plates you have run (line 10 onwards). Helpful syntax:
      • plate_name = Plate(x,y) where x is the number of rows and y is the number of columns.
      • plate_name.fillRow([list], 'substrate_class', 'molecule_name') where 'molecule_name' corresponds to the name of that molecule's .spinput file. Replace fillRow with fillColumn to populate columns instead. Note: If your plate design does not conform to one molecule per row/column, you can modify the Plate's dimensions accordingly. If necessary, an Nx1 plate can specify the components of each reaction individually.
      • After the plates are filled (all "cells" of the plates must be populated with the same kinds of substrate_class), insert the following lines (where the plate names match the ones you have created):
       setup.export_reactions([plate1,plate2,plate3])
       setup.export_for_pca([plate1,plate2,plate3])
    • Run main.py (ctrl + B in Sublime Text) to create R\output_table.csv. This table consists of one row per reaction and one column per descriptor per reaction component. For example, a 2000-reaction screen with 20 base descriptors, 50 ligand descriptors, and 30 additive descriptors would generate a table with 2000 rows and 100 columns in R\output_table.csv.
  6. Run the analysis_template.R file:
    • First, save yield data in Excel in a single column in a file named yields.csv in the rxnpredict folder (note that the file will need to be saved in .csv format). Reactions without yield data due to analytical or other issues should be coded as NA.
    • Open analysis_template.R in R Studio and modify the working directory to the location of the rxnpredict folder (line 9). [Note: Before running the R code on a Mac, change the file locations such that folders are separated by “/” instead of “\\”.] Run the code by pressing ctrl + A and then ctrl + enter, which will perform the following steps:
      • Loading and scaling the output data generated by the python script.
      • Merging the yield data from yields.csv onto the dataset.
      • Splitting the data into training (70%) and test (30%) sets.
      • Training a random forest model using the training set.
      • Plotting a calibration plot of the model using the test set.
      • Calculating R^2 and RMSE values for the model using the test set.
      • Generating a variable importance plot for the model.
    • In the R Studio console, the test set R^2 and RMSE values should be printed in black text. A calibration plot and variable importance plot should be located in the R\plots folder.

rxnpredict's People

Contributors

doylelab avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.