Tuan Phan's Projects
Answers to 120 commonly asked data science interview questions.
Details of an A/B testing data analysis, including source code, RMarkdown files, and the dataset.
A curated list of awesome places to learn and/or practice algorithms.
Some of my cool projects during my free time. Just for fun!
A Collection of Cheatsheets, Books, Questions, and Portfolio For DS/ML Interview Prep
Data science interview questions and answers
Natural Language Processing Project
Interview stuff for friends
I took the Data Science Specialization (all 10 courses in one year) offered by Johns Hopkins University on Coursera. It covers the concepts and tools you'll need throughout the entire data science pipeline, from asking the right kinds of questions to making inferences and publishing results. In the final Capstone Project, you'll apply the skills learned by building a data product using real-world data. At completion, students will have a portfolio demonstrating their mastery of the material.
The Leek group guide to data sharing
Developing Data Products Course from the Johns Hopkins Data Science Lab
Config files for my GitHub profile.
Compilation of resources and insights that helped me on my journey to data scientist
Plotting Assignment 1 for Exploratory Data Analysis
Before you can work with data you have to get some. This course will cover the basic ways that data can be obtained. The course will cover obtaining data from the web, from APIs, from databases and from colleagues in various formats. It will also cover the basics of data cleaning and how to make data "tidy". Tidy data dramatically speeds downstream data analysis tasks. The course will also cover the components of a complete data set including raw data, processing instructions, codebooks, and processed data. The course will cover the basics needed for collecting, cleaning, and sharing data.
Coursera machine learning specialization coursework (Python-based, University of Washington).
Classification models on treatment therapies to improve health status of patients
Track the velocity of moving objects
Repository for Programming Assignment for Practical Machine Learning
Built a predictive model using the random forest algorithm in R to classify dehydration status based on physiological parameters. This is a research project on wearable sensors for physiological measurements that I participated in at Johns Hopkins Lab. We performed 300 clinical trials under different IRBs at Johns Hopkins Hospital.
Side projects from my free time, purely for learning, so please forgive any typos or errors.
I built machine learning models to predict and classify countries based on their happiness score. This is my side project. Algorithms used: multiple linear regression, model selection (best subset selection, forward stepwise selection, LASSO), classification (KNN, decision tree).
A breakout board for the Texas Instruments FDC1004 capacitance-to-digital converter
Repository of sample Python programs for learning Python
Python Data Science Handbook: full text in Jupyter Notebooks
RStudio Cheat Sheets
Developed a program to track the movement of an object inside a microchannel and calculate speed in real-time. The input is real-time videos. The algorithms used in MATLAB are background subtraction, moving average filter, linear regression, and real-time data plotting.
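The tracking steps named above (background subtraction, moving average filter, linear regression) can be sketched roughly as follows. This is an illustrative Python sketch, not the project's MATLAB code: the function names, the threshold value, the trailing-window choice, and the representation of frames as nested lists of grayscale values are all assumptions made for the example.

```python
from statistics import mean


def locate_object(frame, background, threshold=30):
    """Background subtraction: return the (row, col) centroid of the pixels
    that differ from the background by more than `threshold`, or None.
    Frames are assumed to be nested lists of grayscale values."""
    rows, cols = [], []
    for r, (frame_row, bg_row) in enumerate(zip(frame, background)):
        for c, (pixel, bg_pixel) in enumerate(zip(frame_row, bg_row)):
            if abs(pixel - bg_pixel) > threshold:
                rows.append(r)
                cols.append(c)
    if not rows:
        return None
    return mean(rows), mean(cols)


def moving_average(values, window=3):
    """Trailing moving average; the first samples use as many points
    as are available so the output has the same length as the input."""
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        smoothed.append(mean(values[start:i + 1]))
    return smoothed


def estimate_speed(times, positions):
    """Linear regression: the least-squares slope of position against
    time, i.e. the average speed over the observed samples."""
    t_mean = mean(times)
    p_mean = mean(positions)
    numerator = sum((t - t_mean) * (p - p_mean)
                    for t, p in zip(times, positions))
    denominator = sum((t - t_mean) ** 2 for t in times)
    return numerator / denominator
```

In a real-time setting, one plausible arrangement is to call `locate_object` on each incoming frame, smooth the resulting positions with `moving_average`, and refit `estimate_speed` over a recent window of samples; converting pixels per frame into physical units would additionally require the frame rate and the pixel size, which are not given here.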