AI4PH-R

Demo and assignment materials for the AI4PH course: Developing and Deploying Transparent and Reproducible Algorithms for Public Health.

Included files in this repository:

R markdown (.Rmd) files and R script (.R) files:

AI4PH_example.Rmd: The demo code that will be reviewed in class. It builds and evaluates a logistic regression (lr) model on a stroke dataset using tidymodels, then calls the plumber script to generate an API. Students are encouraged to run through this file themselves to make sure that their R environment is set up correctly and all required packages are installed, and to get familiar with the data, tidymodels, and plumber.

Read in: train_data.rds (included, the harmonized train data set)
Produce: stroke_lr_workflow.rds (not included, the trained workflow object including recipes and fitted lr model)
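
For orientation, here is a minimal sketch of the kind of tidymodels workflow described above: a recipe plus a logistic regression model, fitted and saved as a single .rds object. It is not the code in AI4PH_example.Rmd. The toy data, the column names (stroke, age, smoker), the recipe steps, and the output file name are all assumptions made for illustration.

```r
library(tidymodels)

# Toy stand-in for train_data.rds; these column names are hypothetical.
set.seed(1)
toy_train <- tibble(
  stroke = factor(sample(c("no", "yes"), 200, replace = TRUE)),
  age    = runif(200, 20, 90),
  smoker = factor(sample(c("never", "former", "current"), 200, replace = TRUE))
)

# Recipe: scale numeric predictors, then dummy-code the factors (assumed steps).
lr_recipe <- recipe(stroke ~ ., data = toy_train) %>%
  step_normalize(all_numeric_predictors()) %>%
  step_dummy(all_nominal_predictors())

# Logistic regression fitted with the base R glm engine.
lr_spec <- logistic_reg() %>%
  set_engine("glm")

# Bundle preprocessing and model so both travel together in one object.
lr_workflow <- workflow() %>%
  add_recipe(lr_recipe) %>%
  add_model(lr_spec)

lr_fit <- fit(lr_workflow, data = toy_train)

# Save the fitted workflow, then reload it and predict class probabilities.
out_path <- file.path(tempdir(), "toy_lr_workflow.rds")
saveRDS(lr_fit, out_path)
preds <- predict(readRDS(out_path), new_data = toy_train, type = "prob")
```

Because the recipe is saved inside the workflow, any new data passed to predict() must arrive in the same format as the training data.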

AI4PH_assignment.Rmd: The assignment file. In this assignment, you will validate the stroke model we developed in class (AI4PH_example.Rmd) using a different dataset: valid_data.rds. You will run into issues if you use this dataset as is, because it is a raw dataset that has not been harmonized: some of its variables differ from those in the harmonized dataset we used to train and evaluate the model. Your job is to harmonize the validation data so that it is in the same format as the example data we used in class (see lines 79-83 in this file). You can refer to train_data_variables.csv to see the format of the harmonized train data.

Read in:
- valid_data.rds (included, the unharmonized validation set)
- stroke_lr_workflow.rds (not included, this is generated by running AI4PH_example.Rmd)
Reference: train_data_variables.csv (metadata of the train set)
Produce: harmonized_valid_data.rds (the harmonized validation set)
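
As a rough illustration of what harmonization can involve, the sketch below renames a column, recodes a character variable into a factor with fixed levels, and converts a 0/1 outcome into a labelled factor. The actual mismatches in valid_data.rds are not listed here, so every column name, code, and level in this example is hypothetical. Check train_data_variables.csv for the real target format.

```r
library(dplyr)

# Hypothetical raw validation data; not the real valid_data.rds.
toy_valid <- tibble(
  Age    = c(45, 67, 80),
  sex    = c("M", "F", "F"),
  stroke = c(0, 1, 0)
)

# Hypothetical target format: age (numeric), gender (factor), stroke (factor).
toy_harmonized <- toy_valid %>%
  rename(age = Age) %>%                      # match the training column name
  mutate(
    # Map raw codes to the (assumed) labels and level order of the train set.
    gender = factor(sex, levels = c("F", "M"), labels = c("Female", "Male")),
    stroke = factor(stroke, levels = c(0, 1), labels = c("no", "yes"))
  ) %>%
  select(age, gender, stroke)                # keep only the harmonized columns

# Save in the same way as the assignment output (here to a temporary folder).
saveRDS(toy_harmonized, file.path(tempdir(), "toy_harmonized_valid_data.rds"))
```

Setting factor levels explicitly matters: a saved workflow expects the same levels it saw during training.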

stroke_lr_plumber.R: The plumber script, which is used in both the in-class demo and the assignment. No modification is needed.

Read in: stroke_lr_workflow.rds (not included, this is generated by running AI4PH_example.Rmd)

Data files: Please place the data files and the R markdown files in the same folder.

train_data.rds: the harmonized train data set, used in AI4PH_example.Rmd to train the model.
valid_data.rds: the unharmonized validation data set, used in AI4PH_assignment.Rmd.

Metadata/ data dictionary:

train_data_variables.csv (data dictionary of the train set). You’ll also use this to help you harmonize the validation data in the assignment.

Assignment submission and office hour

To submit your work on the assignment, please send us the assignment file with your code and all the output. You can find our emails on Canvas. Please rename the file as AI4PH_assignment_YourName.Rmd. For example, my name is Juan Li, so I would rename my submission as AI4PH_assignment_JuanLi.Rmd.

If you have any questions, you are encouraged to join the office hour on February 19th.
