Joint effort with Henry Vu, applying feature engineering, data massaging, and rigorous analysis of useful variables to predict the survival of people in a test dataset using the random forest algorithm.
We carefully went through Dave Langer's videos and followed some instructions from Will Stanton to learn why it is important to understand your data before using models that would predict with a "good" error percentage.
Videos watched and thoroughly discussed in person:
https://www.youtube.com/watch?v=32o0DnuRjfg - Intro to Data Science with R - Data Analysis Part 1
https://www.youtube.com/watch?v=u6sahb7Hmog - Intro to Data Science with R - Data Analysis Part 2
https://www.youtube.com/watch?v=aMV_6LmCs4Q - Intro to Data Science with R - Data Analysis Part 3
https://www.youtube.com/watch?v=UHJH7w9q4Lc - Intro to Data Science with R - Exploratory Modeling 1
https://www.youtube.com/watch?v=84JSk36og34 - Intro to Data Science with R - Cross Validation
Links viewed and drawn on for a portion of the code:
http://will-stanton.com/machine-learning-with-r-an-irresponsibly-fast-tutorial/ - Machine Learning with R: An Irresponsibly Fast Tutorial