Description: This repository contains baseline experiments for multi-task molecular property prediction on datasets from Harvard's Therapeutics Data Commons (TDC).
Installation via conda (the environment targets Python 3.9):
conda env create --file d4_mtp.yml
conda activate d4_mtp
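Before running any of the scripts, it can help to confirm that the environment is actually active. The following is a minimal sketch, not part of this repository: `require_env` is a hypothetical helper name, and it relies on conda setting the `CONDA_DEFAULT_ENV` variable to the name of the active environment.

```shell
# Sketch: fail early if the expected conda environment is not active.
# `require_env` is a hypothetical helper, not part of this repository.
require_env() {
  local expected="$1"
  # conda sets CONDA_DEFAULT_ENV to the name of the active environment.
  if [ "${CONDA_DEFAULT_ENV:-}" != "$expected" ]; then
    echo "Expected conda env '$expected', found '${CONDA_DEFAULT_ENV:-none}'." >&2
    return 1
  fi
}
```

Calling `require_env d4_mtp` returns a non-zero status and prints a message to stderr when a different environment (or none) is active.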
Six tasks are categorised as regression: Caco-2, Lipophilicity, Solubility (AqSolDB), PPBR, Acute Toxicity LD50, and Clearance (Hepatocyte). To generate the data files (train and test), run the following commands:
cd tdc_regression
bash reg_data_generation.sh
To train the network (run from the repository root):
cd tdc_regression
bash train_reg_model.sh
Four tasks are categorised as classification: Bioavailability, CYP P450 2D6 Inhibition, Ames Mutagenicity, and hERG Blockers. To generate the data files (train and test), run the following commands:
cd tdc_classification
bash clf_data_generation.sh
To train the network (run from the repository root):
cd tdc_classification
bash train_clf_model.sh
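The four steps above can also be chained into a single run. This is a minimal sketch rather than a script shipped with the repository: `run_step`, `run_all`, and the `DRY_RUN` variable are hypothetical names introduced here, and the sketch assumes it is invoked from the repository root. Only the directory and script names come from the commands above.

```shell
# Sketch: run data generation, then training, for both task families.
# `run_step`, `run_all`, and DRY_RUN are hypothetical, not part of this repository.

run_step() {
  local dir="$1" script="$2"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the command instead of running it.
    echo "cd $dir && bash $script"
  else
    # The subshell keeps the caller's working directory unchanged.
    (cd "$dir" && bash "$script")
  fi
}

run_all() {
  # Each step runs only if the previous one succeeded.
  run_step tdc_regression reg_data_generation.sh &&
  run_step tdc_regression train_reg_model.sh &&
  run_step tdc_classification clf_data_generation.sh &&
  run_step tdc_classification train_clf_model.sh
}
```

Setting `DRY_RUN=1` before calling `run_all` prints the commands without executing them, which is a cheap way to check the order before starting a long training run.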
A one-stop portal for observing current experiments:
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change. Please make sure to update tests as appropriate.