mlforpublicpolicylab's Introduction

10718, 94889: Machine Learning for Public Policy Lab

Spring 2020: Tues & Thurs, 3:00-4:20pm, GHC 4307

This is a project-based course designed to provide students training and experience in solving real-world problems using machine learning, with a focus on problems from public policy and social good.

Through lectures, discussions, readings, and project assignments, students will learn about and experience building end-to-end machine learning systems, starting from project definition and scoping, through modeling, to field validation and turning their analysis into action. Through the course, students will develop skills in problem formulation, working with messy data, communicating about machine learning with non-technical stakeholders, model interpretability, understanding and mitigating algorithmic bias & disparities, and evaluating the impact of deployed models.

Students are expected to know Python and to have prior coursework in machine learning.

DRAFT SYLLABUS

People

Instructors

Rayid Ghani
GHC 8023
Office Hours: TBD

Kit Rodolfa
GHC 8018
Office Hours: TBD

Teaching Assistants

Sebastian Caldas
Office Hours: Tue 10-11am and Wed 10-11am, in GHC 8009

Himil Sheth
Office Hours: Mon 2:00-3:00pm and Thu 4:30-5:30pm, GHC 8th Floor, by printers

Tentative Schedule

See the draft syllabus for more detail, including information about group projects, grading, and helpful optional readings.

Week 1
Dates: Tu: Jan 14; Th: Jan 16
Lecture/Discussion Topic:
  • Tu: Intro/Overview + Project Overviews
  • Th: Scoping, Problem Definition, Balancing goals (equity, efficiency, effectiveness)
Project Activity: Intro/Overview
Goal: Get familiar with the class, goals, and understand project choices
Required Readings (Thursday):
  • Data Science Project Scoping Guide
  • Using Machine Learning to Assess the Risk of and Prevent Water Main Breaks

Week 2
Dates: Tu: Jan 21; Th: Jan 23
Lecture/Discussion Topic:
  • Tu: Case Studies + Discussion
  • Th: Acquiring Data, Privacy, Record Linkage
Project Activity: Project Definition & Data Discovery
Goal: Data Audit and Exploration
TA Sessions: SQL, Databases, GitHub
Required Readings (Tuesday):
  • Fine-grained dengue forecasting using telephone triage services
  • Predictive Modeling for Public Health: Preventing Childhood Lead Poisoning
  • What Happens When an Algorithm Cuts Your Health Care
Deliverable / Expected Output: Beginning of week, team and project assignments

Week 3
Dates: Tu: Jan 28; Th: Jan 30
Lecture/Discussion Topic:
  • Tu: Data Exploration
  • Th: Building ML Pipelines
Project Activity / Goal: Finalize Project Scope and Data Stories
Required Readings (Tuesday):
  • TBD reading on data exploration
  • Practical Statistics for Data Scientists, Chapter 1
Required Readings (Thursday):
  • Architecting a Machine Learning Pipeline
Deliverable / Expected Output:
  • ETL of some dataset (census?)
  • Data exploration
  • Scope refinement

Week 4
Dates: Tu: Feb 4; Th: Feb 6
Lecture/Discussion Topic: Analytical Formulation / Baselines
Project Activity / Goal: Initial Data Science Pipeline Setup and Mockups (problem formulation and validation process)
Required Readings (Tuesday):
  • Dissecting Racial Bias in an Algorithm Used to Manage the Health of Populations
  • Always Start with a Stupid Model, No Exceptions
Deliverable / Expected Output:
  • First week of deep dives
  • Project Scope + Proposal with Descriptive Statistics

Week 5
Dates: Tu: Feb 11; Th: Feb 13
Lecture/Discussion Topic: Feature Engineering / Imputation
Project Activity: Code Pipeline Development
Goal: Iteration 1 - Build End to End Code Pipeline (Focus on end-to-end shell)
Required Readings (Tuesday):
  • TBD Feature Development Case Study
  • Missing Data Conundrum
Deliverable / Expected Output:
  • Skeleton Code (Pipeline), Mockups
  • Proposal Peer Reviews

Week 6
Dates: Tu: Feb 18; Th: Feb 20
Lecture/Discussion Topic: Performance Metrics / Evaluation Pt. I (splits, metrics)
Required Readings (Tuesday):
  • Cross-validation strategies for data with temporal, spatial, hierarchical, or phylogenetic structure
  • The Secrets of Machine Learning
Deliverable / Expected Output: Technical Modeling Plan (features, label definition(s), model specifications, etc.)

Week 7
Dates: Tu: Feb 25; Th: Feb 27
Holidays?: Feb 24 drop deadline
Lecture/Discussion Topic: Performance Metrics / Evaluation Pt. II (audition)
Project Activity / Goal: Iteration 2 - End to End Code Pipeline (Focus on feature development)
Required Readings (Tuesday):
  • Evaluating and Comparing Classifiers
  • Transductive Optimization of Top k Precision
Deliverable / Expected Output: Code (Pipeline), Initial Models (and analysis)

Week 8
Dates: Tu: Mar 3; Th: Mar 5
Lecture/Discussion Topic: Overfitting, Leakage, Issues in Deployment
Required Readings (Tuesday):
  • Three Pitfalls to Avoid in Machine Learning
  • Leakage in Data Mining
  • Why is Machine Learning Deployment Hard?
Deliverable / Expected Output: Early Results: Correct but Crappy

Week 9
Dates: Tu: Mar 17; Th: Mar 19
Holidays?: Previous week is spring break
Lecture/Discussion Topic: Model Interpretability Pt. I: global + postmodeling
Project Activity / Goal: Iteration 3 - End to End Code Pipeline (Focus on evaluation, results, and initial front-end demo)
Required Readings (Tuesday):
  • Interpretable Classification Models for Recidivism Prediction
  • Intelligible Models for HealthCare: Predicting Pneumonia Risk and Hospital 30-day Readmission
Deliverable / Expected Output: Refined Feature List

Week 10
Dates: Tu: Mar 24; Th: Mar 26
Lecture/Discussion Topic: Model Interpretability Pt. II: local
Required Readings (Tuesday):
  • Why Should I Trust You? Explaining the Predictions of any Classifier
  • Model Agnostic Supervised Local Explanations
  • Explainable machine-learning predictions for the prevention of hypoxaemia during surgery
Deliverable / Expected Output: Model Interpretation

Week 11
Dates: Tu: Mar 31; Th: Apr 2
Lecture/Discussion Topic: Bias and Fairness Pt. I
Required Readings (Tuesday):
  • Fairness Definitions Explained
  • A Theory of Justice, pages 1-19
  • Racial Equity in Algorithmic Criminal Justice [Focus on sections: I.B.2, all of section II, III introduction, III.B, and III.D.3]
Deliverable / Expected Output:
  • Results (across models, features, metrics)
  • Add bias analysis methods

Week 12
Dates: Tu: Apr 7; Th: Apr 9
Lecture/Discussion Topic: Bias and Fairness Pt. II
Project Activity: Model selection, evaluation, balancing efficiency and equity
Goal: Final model choice and understanding its performance and impact on disparities
Required Readings (Tuesday):
  • A case study of algorithm-assisted decision making in child maltreatment hotline screening decisions
  • Equality of Opportunity in Supervised Learning
  • Classification with fairness constraints: A meta-algorithm with provable guarantees
Deliverable / Expected Output: Draft Research Proposal Section

Week 13
Dates: Tu: Apr 14; Th: Apr 16
Holidays?: Apr 16
Lecture/Discussion Topic: Causality and Field Validation
Required Readings (Tuesday):
  • The seven tools of causal inference, with reflections on machine learning
  • TBD Field Trial Case Study
Deliverable / Expected Output: No deep dive - Thursday off

Week 14
Dates: Tu: Apr 21; Th: Apr 23
Lecture/Discussion Topic: Analysis to Action, Accountability and Transparency
Project Activity: Communications & Transition Planning
Goal: Project Report and Presentations; Field Trial Design
Required Readings (Tuesday):
  • Ethics and Data Science, entire book
  • Communicating Data with Tableau, Chapter 1
  • Teaching Statistics: A Bag of Tricks, Chapter 11
Deliverable / Expected Output:
  • Last week of deep dives
  • Draft Field Trial Design Section

Week 15
Dates: Tu: Apr 28; Th: Apr 30
Lecture/Discussion Topic: Final Presentations
Project Activity / Goal: Presentations
Deliverable / Expected Output: Presentation

Week 16
Dates: May 7 (Finals Week)
Lecture/Discussion Topic: Final Report Due
Project Activity / Goal: Final Report
Deliverable / Expected Output: Report and Repo and Code Documentation
