This repository contains the source code for all the analyses and figures reported in the following paper:
Sub-dominant principal components inform new vaccine targets for HIV Gag (submitted)
Syed Faraz Ahmed, Ahmed A. Quadeer, David Morales-Jimenez and Matthew R. McKay.
The MATLAB source code for the analysis reported in the paper is contained in the “Analysis” folder. MATLAB data files are contained within the "Data" subfolder.
The data includes:
- Raw MSA
- Pre-processed Gag MSA matrix
- List of Gag sites related to known biochemical domains
- List of Gag sites related to HIV controllers and progressors
The Analysis folder also includes a code file "general_code.m" that can be used to infer sectors as described in the paper, by providing any MSA and a few parameters as inputs to the function.
The R scripts used for generating the accompanying figures are contained in the “Figures” folder.
Dependencies
- MATLAB (preferably R2017a or later) installed with the following additional toolboxes:
  - Bioinformatics Toolbox
  - Statistics and Machine Learning Toolbox
- RStudio with R (preferably R version 3.5.1 or later) installed with the following packages:
  - tidyverse
  - ComplexHeatmap
  - RColorBrewer
  - multipanelfigure
  - circlize
  - scales
  - ggpubr
  - grid
  - magick
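The R packages listed above could be installed with a short script along the following lines. This is a minimal sketch, not part of the repository: the helper names `missing_packages` and `install_dependencies` are hypothetical, and it assumes network access. ComplexHeatmap is distributed through Bioconductor rather than CRAN, and grid ships with base R, so it needs no installation.

```r
# CRAN packages from the dependency list (grid is omitted: it ships with base R).
cran_packages <- c("tidyverse", "RColorBrewer", "multipanelfigure",
                   "circlize", "scales", "ggpubr", "magick")

# ComplexHeatmap is distributed through Bioconductor, not CRAN.
bioc_packages <- c("ComplexHeatmap")

# Hypothetical helper: return the subset of `pkgs` that is not yet installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Hypothetical helper: install whatever is missing (needs network access).
install_dependencies <- function() {
  to_install <- missing_packages(cran_packages)
  if (length(to_install) > 0) {
    install.packages(to_install)
  }
  if (length(missing_packages(bioc_packages)) > 0) {
    if (!requireNamespace("BiocManager", quietly = TRUE)) {
      install.packages("BiocManager")
    }
    BiocManager::install(bioc_packages)
  }
}
```

Calling `install_dependencies()` once from the R console before running the figure scripts should be enough. Packages that are already present are skipped.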
To re-run the analysis presented in the paper
- Open MATLAB and run the file main.m
To re-generate the figures presented in the paper
- Open RStudio and run the file figures.rmd
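As an alternative to running the file interactively in RStudio, the figures could in principle be regenerated from the R console. The sketch below is not part of the repository. It assumes that the rmarkdown package is installed and that figures.rmd sits in the Figures folder relative to the working directory. The helper name `render_figures` is hypothetical.

```r
# Hypothetical helper: knit figures.rmd from the R console.
# Assumes the working directory is the repository root and that
# the rmarkdown package is installed.
render_figures <- function(rmd_path = file.path("Figures", "figures.rmd")) {
  if (!file.exists(rmd_path)) {
    stop("Cannot find ", rmd_path, "; check the working directory.")
  }
  rmarkdown::render(rmd_path)
}
```

If the file lives elsewhere, pass its path explicitly, for example `render_figures("path/to/figures.rmd")`.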
For any queries related to code files, please contact [email protected]