
👋 Hi there, my name is Yun Choi

  • I am an aspiring data scientist who loves drawing interesting insights from data and communicating them by using analytical tools!
  • I graduated from Northwestern University in 2021 with a Bachelor's Degree in Industrial Engineering and a Minor in Data Science
  • Experience coding in Python, R, and SQL
  • I’m currently building my skills in various ML algorithms, Neural Networks, Natural Language Processing, and Sabermetrics, and in creating interactive visualization tools with Shiny and Tableau
  • The repositories contain a variety of assignments and projects I've taken part in. I update this page regularly as I keep learning

Projects

SQL Business Analytics on E-Commerce Database
This project takes a mock, custom-built e-commerce database and uses SQL queries to extract data and insights valuable for a potential e-commerce business, such as website and traffic performance, product-level sales performance, and how customers access and interact with the website. Based on the results of the queries, I conducted analysis to help the business understand various user-interaction trends with the website, as well as to make bidding recommendations for the growth and profitability of the mock business.
Languages/Packages Used: SQL, Python (pandas, matplotlib, seaborn)
Project | Full Code
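
As a rough illustration of the kind of traffic-performance query described above, here is a minimal sketch that computes a session-to-order conversion rate per traffic source. The table names (website_sessions, orders), the column names, and every row are hypothetical and invented for this example; they are not taken from the project's database. It uses Python's built-in sqlite3 module so it runs without any setup.

```python
import sqlite3

# Hypothetical schema and rows, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE website_sessions (session_id INTEGER PRIMARY KEY, utm_source TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, session_id INTEGER);
INSERT INTO website_sessions VALUES
    (1, 'search'), (2, 'search'), (3, 'search'), (4, 'search'),
    (5, 'social'), (6, 'social'), (7, 'social'), (8, 'social');
INSERT INTO orders VALUES (1, 1), (2, 3), (3, 5);
""")

# Sessions, orders, and conversion rate for each traffic source.
# LEFT JOIN keeps sessions that never produced an order.
QUERY = """
SELECT s.utm_source,
       COUNT(DISTINCT s.session_id) AS sessions,
       COUNT(DISTINCT o.order_id)   AS orders,
       1.0 * COUNT(DISTINCT o.order_id) / COUNT(DISTINCT s.session_id) AS conversion_rate
FROM website_sessions AS s
LEFT JOIN orders AS o ON o.session_id = s.session_id
GROUP BY s.utm_source
ORDER BY s.utm_source;
"""

rows = conn.execute(QUERY).fetchall()
for source, n_sessions, n_orders, rate in rows:
    print(f"{source}: {n_sessions} sessions, {n_orders} orders, {rate:.0%} conversion")
```

A comparison like this is one possible input to a bidding recommendation: a source that converts better may justify a higher bid, subject to its cost per click.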

K-Means Clustering Medicare Services in the Emergency Department
The emergency department (ED) is chaotic by nature and resource-intensive, which often makes it expensive to manage. Understanding spending patterns in the ED is therefore crucial for healthcare entities and systems to operate efficiently. The goal of this analysis is to generate insights about ED spending patterns by using K-means clustering to segment lines of service with similar characteristics into groups.
Languages/Packages Used: Python (pandas, seaborn, scikit-learn)
Project | Full Code
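
For readers unfamiliar with K-means, the sketch below shows the core loop (Lloyd's algorithm) in plain Python. It is a stand-in for illustration only: the project itself uses scikit-learn, and the feature pairs, starting centroids, and function name here are made up. Real features would normally be scaled before clustering.

```python
import math

def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm: alternate assignment and update steps until stable."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centroids.append(centroids[k])  # leave an empty cluster in place
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids

# Made-up two-feature rows, one per line of service.
services = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
labels, centroids = kmeans(services, centroids=[services[0], services[3]])
print(labels)
print(centroids)
```

Fixed starting centroids keep the toy run deterministic; library implementations typically use randomized initialization with several restarts.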

Building Classifiers of Specific Text Entities
I built text classifiers that parse 730 Business Insider articles and extract every entity recognized as one of three types: CEOs, companies, and percentages. Using the provided labels for each of the three categories, I constructed a logistic regression model per entity type that classifies words based on the context of the sentence they appear in. The CEO, company, and percentage classifiers were then run on a subset of entities from the corpus composed of all 730 articles.
Languages/Packages Used: Python (pandas, seaborn, matplotlib, scikit-learn, re, nltk, spacy)
Project | Full Code
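
To make "classifying words based on their sentence context" more concrete, the sketch below finds percentage candidates with a regular expression and records the words on either side of each one. This is an assumption about what such features could look like, not the project's actual feature set; the pattern, the function name, and the example sentence are all hypothetical. Features like these would then be encoded and passed to a classifier such as logistic regression.

```python
import re

# Hypothetical pattern for percentage candidates such as "4.5%" or "12 percent".
PERCENT_PATTERN = re.compile(r"\d+(?:\.\d+)?\s?(?:%|percent\b)", re.IGNORECASE)

def context_features(sentence, start, end, window=2):
    """Describe a candidate span by the lowercase words just before and after it."""
    before = re.findall(r"\w+", sentence[:start].lower())[-window:]
    after = re.findall(r"\w+", sentence[end:].lower())[:window]
    return {"before": before, "after": after}

# Made-up sentence, not taken from the corpus.
sentence = "Shares rose 4.5% after the CEO announced the deal."
candidates = [
    (m.group(), context_features(sentence, m.start(), m.end()))
    for m in PERCENT_PATTERN.finditer(sentence)
]
print(candidates)
```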

Building a QA System
The following document describes a step-by-step process for building a QA (Question Answering) system that parses Business Insider articles from 2013 and 2014 and is capable of answering the following questions:

  1. Which companies went bankrupt in month X of year Y?
  2. What percentage of drop or increase in GDP is associated with X?
  3. Who is the CEO of company X?

By combining Elasticsearch's scoring system, spaCy’s NER tagger, and other heuristic methods, the QA system is able to categorize queries and output relevant answers.
Project/Full Code
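
The query-categorization step can be pictured with a minimal sketch like the one below, which routes a question to one of the three supported types and pulls out its parameters. The regular expressions and names are hypothetical and assume questions follow the three templates closely; the retrieval and answer-extraction steps (Elasticsearch scoring, spaCy NER) are not shown.

```python
import re

# Hypothetical patterns, one per supported question template.
QUESTION_PATTERNS = {
    "bankruptcy": re.compile(
        r"^which companies went bankrupt in (?P<month>\w+) of (?P<year>\d{4})\?$",
        re.IGNORECASE,
    ),
    "gdp": re.compile(
        r"^what percentage of (?:drop or increase|drop|increase) in gdp "
        r"is associated with (?P<factor>.+)\?$",
        re.IGNORECASE,
    ),
    "ceo": re.compile(r"^who is the ceo of (?P<company>.+)\?$", re.IGNORECASE),
}

def categorize(question):
    """Return (question_type, extracted_slots), or (None, {}) if nothing matches."""
    text = question.strip()
    for name, pattern in QUESTION_PATTERNS.items():
        match = pattern.match(text)
        if match:
            return name, match.groupdict()
    return None, {}

print(categorize("Who is the CEO of Apple?"))
print(categorize("Which companies went bankrupt in March of 2014?"))
```

Once a question is categorized, the extracted slots (company, month and year, or factor) can drive the search query, and the question type determines which kind of entity to look for in the top-scoring passages.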

Tuning and Comparing Regression Models on Wildfires Dataset
The following walks through a step-by-step model tuning and selection process on the provided training and test datasets, wildfires_train and wildfires_test. I built several candidate models on the training set using tree-based regression algorithms and linear regression, then evaluated them on the provided test set.
Project/Full Code
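
The select-by-test-error idea can be sketched in a few lines. The example below is a simplified stand-in under stated assumptions: it compares a mean baseline against a one-variable least-squares line (the tree-based models and the tuning step are left out), uses RMSE as an assumed metric, and runs on made-up numbers rather than the wildfires data.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def fit_mean(x, y):
    """Baseline candidate: always predict the training mean."""
    mean_y = sum(y) / len(y)
    return lambda value: mean_y

def fit_line(x, y):
    """Simple least-squares line with one predictor."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sum(
        (a - mean_x) ** 2 for a in x
    )
    intercept = mean_y - slope * mean_x
    return lambda value: intercept + slope * value

# Made-up training and test data.
train_x, train_y = [1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]
test_x, test_y = [5.0, 6.0], [10.0, 12.0]

# Fit every candidate on the training set, then score each on the test set.
candidates = {"mean_baseline": fit_mean, "linear_regression": fit_line}
models = {name: fit(train_x, train_y) for name, fit in candidates.items()}
scores = {name: rmse(test_y, [model(v) for v in test_x]) for name, model in models.items()}
best = min(scores, key=scores.get)
print(scores, best)
```

When hyperparameters are tuned as well, that tuning is usually done with cross-validation on the training set so the test set is used only for the final comparison.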

EDA of MLB Batters from 1985 to 2020
I conducted basic exploratory data analysis to explore patterns and correlations within an MLB batter's performance metrics dating from 1985. I also attempted to highlight important factors that contribute to a team reaching the postseason (playoffs). I plan to use some of the insights gained from the following rudimentary data exploration for other personal projects such as using machine learning to determine whether a team is undervaluing or overvaluing a player, understanding player performance regression, as well as predicting the number of team wins given historical stats of players in the team.
Project/Full Code

📫 [email protected]
