Princeton AI4ALL Fragile Families Project 2018

Authors and collaborators: Agata Foryciarz, Desmond Zhong, Renato Pagliara Vasquez, Jonathan Lu, Kristin Catena, Prof. Matt Salganik, Prof. Barbara Engelhardt

Background

The Fragile Families & Child Wellbeing Study is following a cohort of nearly 5,000 children born in large U.S. cities between 1998 and 2000 (roughly three-quarters of whom were born to unmarried parents). We refer to unmarried parents and their children as “fragile families” to underscore that they are families and that they are at greater risk of breaking up and living in poverty than more traditional families.

The core Study was originally designed primarily to address four questions of great interest to researchers and policy makers: (1) What are the conditions and capabilities of unmarried parents, especially fathers? (2) What is the nature of the relationships between unmarried parents? (3) How do children born into these families fare? (4) How do policies and environmental conditions affect families and children?

The core Study consists of interviews with mothers, fathers, and/or primary caregivers at birth and again when children are ages one, three, five, nine, and fifteen. The parent interviews collect information on attitudes, relationships, parenting behavior, demographic characteristics, health (mental and physical), economic and employment status, neighborhood characteristics, and program participation. Additionally, in-home assessments of children and their home environments were conducted at ages three, five, nine, and fifteen. The in-home interview collects information on children’s cognitive and emotional development, health, and home environment. Several collaborative studies provide additional information on parents’ medical, employment, and incarceration histories, religion, child care, and early childhood education.

Six waves of data are publicly available through the Office of Population Research data archive:

  • Wave 1 (Baseline)
  • Wave 2 (Year 1)
  • Wave 3 (Year 3)
  • Wave 4 (Year 5)
  • Wave 5 (Year 9)
  • Wave 6 (Year 15)

Researchers have used this data to develop models that predict key attributes affecting disadvantaged children and to suggest new policies to improve child outcomes. In this project, you will use data collected as a part of the Fragile Families Challenge to uncover factors that influence young people’s academic performance, confidence and grit, and psychological well-being. You will generate scientific questions and perform data exploration, feature selection, and machine learning to evaluate your hypotheses. You will explore alternative explanations for your results and work closely with the project instructors to refine your hypotheses. You will also work together to design and discuss policy proposals based on your findings that would help provide services and programs to facilitate children’s success.

More information: http://www.fragilefamilies.princeton.edu/

Fragile Families Documentation: https://fragilefamilies.princeton.edu/documentation

Fragile Families Challenge: http://www.fragilefamilieschallenge.org/

Fragile Families Challenge Blog: http://www.fragilefamilieschallenge.org/blog-posts/

Fragile Families Challenge "getting started" video: https://www.youtube.com/watch?v=HrYPtdXeSaM&feature=youtu.be

Fragile Families Challenge "getting started" slides: https://github.com/fragilefamilieschallenge/slides/blob/master/ffchallenge_getting_started_cos424.pdf

Data

We use the Fragile Families Challenge data. The data have been collected at six time points since the children's birth: at birth and at ages 1, 3, 5, 9, and 15. Our challenge is to predict outcomes at age 15 based on the variables from the earlier time points.
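Before building a model, it can help to have a simple reference point for the prediction task. The sketch below is one common choice, not necessarily the approach used in this project: it predicts the mean of the observed training outcomes for every family and scores that guess with mean squared error. The function names, the choice of metric, and the numbers are all assumptions made for illustration; none of the values come from the study.

```python
from statistics import mean


def mean_baseline(train_outcomes):
    """Return the mean of the observed (non-missing) training outcomes."""
    observed = [y for y in train_outcomes if y is not None]
    return mean(observed)


def mean_squared_error(y_true, y_pred):
    """Average squared difference between true and predicted values."""
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))


# Made-up illustrative numbers, NOT real study data.
train_outcome = [2.0, 3.0, None, 4.0]   # None marks a missing outcome
held_out_outcome = [3.0, 5.0]

baseline = mean_baseline(train_outcome)
predictions = [baseline] * len(held_out_outcome)
error = mean_squared_error(held_out_outcome, predictions)
```

Any model trained on the earlier time points should aim for a lower error than this constant guess.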

The data is split into three CSV (comma-separated values) files:

  • background.csv contains all the data up to year 9.
  • train.csv and test.csv contain data for age 15. The training set consists of 12,000 variables across 3,200 families. The test set consists of 6 variables across ~2,000 families.

Our goal by the end of the 3 weeks is to predict some of the six variables in the testing data.
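The files described above are linked by a per-family identifier. The following is a minimal sketch of how the background variables and the training outcomes might be joined, using only the Python standard library so it runs without extra packages (in a notebook, pandas would be the more usual tool). The ID column name `challengeID`, the helper names, and the `ff_data/` paths in the usage note are assumptions; check the header of your own files.

```python
import csv


def read_rows(path, id_col="challengeID"):
    """Read a CSV file into a dict mapping each family's ID to its row.

    id_col is an assumed column name; adjust it to match the real header.
    """
    with open(path, newline="") as f:
        return {row[id_col]: row for row in csv.DictReader(f)}


def join_on_id(background, outcomes, id_col="challengeID"):
    """Attach outcome columns to the background rows that share an ID."""
    joined = []
    for family_id, outcome_row in outcomes.items():
        if family_id in background:
            merged = dict(background[family_id])
            # Copy every outcome column except the duplicate ID column.
            merged.update(
                {k: v for k, v in outcome_row.items() if k != id_col}
            )
            joined.append(merged)
    return joined
```

A call such as `join_on_id(read_rows("ff_data/background.csv"), read_rows("ff_data/train.csv"))` would return one dictionary per family present in both files. All values are read as strings, so numeric columns need converting before modeling.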

Data Download

We will provide the data to you on a USB stick. Note that although this data is anonymized, it is still highly sensitive and should only be used for research purposes. Please do not copy the data to any device other than your AI4ALL computer, and do not share your local copies with anyone outside of AI4ALL.

Metadata - variable description

Each variable in the data has a dictionary of features associated with it, such as: source (constructed/weight/id number/...), respondent (father/mother/teacher), umbrella category (parental relationship, health and health behavior,...) and others. You can view variables by their features here: http://metadata.fragilefamilies.princeton.edu/variables.
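Once you have metadata for a set of variables, the same kind of feature-based filtering can be done in code. The sketch below assumes each variable is represented as a plain dictionary; the variable names, the key names (`source`, `respondent`, `umbrella`), and the `select_variables` helper are hypothetical and do not reflect the metadata site's actual format.

```python
# Hypothetical metadata records; names and values are illustrative only.
variables = [
    {"name": "var_a", "source": "constructed", "respondent": "mother",
     "umbrella": "health and health behavior"},
    {"name": "var_b", "source": "constructed", "respondent": "father",
     "umbrella": "parental relationship"},
    {"name": "var_c", "source": "constructed", "respondent": "teacher",
     "umbrella": "health and health behavior"},
]


def select_variables(records, **features):
    """Return names of variables whose metadata matches every given feature."""
    return [
        record["name"]
        for record in records
        if all(record.get(key) == value for key, value in features.items())
    ]


# Variables answered by the mother in the health umbrella category.
mother_health = select_variables(
    variables,
    respondent="mother",
    umbrella="health and health behavior",
)
```

Filtering this way is one option for narrowing thousands of variables down to a candidate set before feature selection.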

Setup:

  • Create a folder on your Desktop with the name "Fragile_Families".

  • Inside this folder, create two new folders with the names "ff_data" and "ff_notebooks".

  • Download the zip file containing the Jupyter notebook for the day from https://github.com/renapagli/AI4All-FragileFamilies/

  • Extract the zip file and save the .ipynb file in the "ff_notebooks" folder.

  • Open a terminal by clicking on the Spotlight Search (magnifying glass icon) in the upper-right corner of the desktop, entering "Terminal", and pressing Enter.

  • Change into the "Fragile_Families" folder:

    cd Desktop/Fragile_Families

  • To run a Jupyter notebook:

    source ~/miniconda3/bin/activate

    jupyter notebook
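The folder layout from the first two setup steps can also be created from Python rather than by hand. This is a minimal sketch; `make_project_folders` is a hypothetical helper, and passing your Desktop as the base directory is an assumption about where you want the project to live.

```python
from pathlib import Path


def make_project_folders(base_dir):
    """Create Fragile_Families with its ff_data and ff_notebooks subfolders.

    Existing folders are left untouched, so the call is safe to repeat.
    """
    project = Path(base_dir) / "Fragile_Families"
    for name in ("ff_data", "ff_notebooks"):
        (project / name).mkdir(parents=True, exist_ok=True)
    return project
```

For example, `make_project_folders(Path.home() / "Desktop")` would create the folders on the Desktop and return the path of the project folder.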
