
springboardmay2022's Introduction

Springboard Bootcamp Projects

This repository collects the projects I worked on during my time at Springboard.

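We worked on a range of topics; the sections below cover the data science pipeline, regression algorithms, clustering algorithms, decision tree algorithms, time series analysis, hyperparameter tuning, and SQL.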

Some of my favorite projects from each section are briefly described below. Each page has a more detailed README for its individual projects.

Data Science Pipeline

This folder contains general projects related to the data science pipeline, such as exploring APIs, creating presentations, and the fundamental steps in data science.

The London Borough case study was my favorite, as it was one of my first forays into data science. We performed a basic analysis of data from the London boroughs. Within this folder we also made some calls to the NASDAQ API and explored the returned data at a surface level.

One of my favorite parts was using data visualization to explore the data.

Data Pipeline Image

Regression Algorithms

This folder contains projects on regression analysis. So far, we have explored linear and logistic regression. Of these two projects, the linear regression case study was my favorite.

Within the linear regression case study, we explored the Red Wine dataset and created multiple linear regression models. This was my first foray into adjusting models and understanding the process of performing an analysis. In each case study we went through loading, cleaning, and visualizing the data, then testing and tuning the models.
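As a rough illustration of what fitting a multiple linear regression involves, here is a minimal sketch that solves the normal equations with only the Python standard library. The data is made up (it is not the Red Wine data), the function names are hypothetical, and this is not the case study's actual code, which may well have used a modeling library instead.

```python
def fit_linear_regression(X, y):
    """Fit y ~ b0 + b1*x1 + ... + bk*xk by solving (X^T X) b = X^T y."""
    # Prepend a constant 1.0 to every row so the first coefficient is the intercept.
    rows = [[1.0] + [float(v) for v in row] for row in X]
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]


def predict(coefs, row):
    """Apply fitted coefficients (intercept first) to one feature row."""
    return coefs[0] + sum(c * x for c, x in zip(coefs[1:], row))


# Made-up data generated exactly from y = 1 + 2*x1 + 3*x2.
X = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]]
y = [1 + 2 * a + 3 * b for a, b in X]

coefs = fit_linear_regression(X, y)
print([round(c, 6) for c in coefs])
```

Because the toy data has no noise, the fitted coefficients recover the generating ones up to floating-point error; on real data the fit would only approximate the targets.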

Linear Regression Image

Clustering Algorithms

In this folder, we look at the foundations of clustering: Euclidean vs Manhattan distance, cosine similarity, and the k-means algorithm. We learned to use cosine similarity to compare sentences represented as tf-idf vectors. My favorite project would be the k-means project.

In the k-means project we went through the entire process of creating a k-means clustering algorithm. The case study described the fundamentals of how k-means works and ran through some methods for optimizing and measuring the effectiveness of the algorithm. We created scree (elbow) plots to choose the number of clusters and used silhouette analysis to measure clustering quality.
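The three measures mentioned above can be sketched in a few lines of plain Python. This is an illustrative sketch on made-up vectors, not code from the folder; for sentence comparison, the inputs would be tf-idf vectors rather than the toy lists used here.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute coordinate differences (city-block distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


print(euclidean([0, 0], [3, 4]))          # 5.0
print(manhattan([0, 0], [3, 4]))          # 7
print(cosine_similarity([1, 0], [0, 1]))  # 0.0 (orthogonal vectors)
```

Cosine similarity ignores vector length and only looks at direction, which is why it suits text vectors where documents differ a lot in length.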
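To make the assign-then-update loop concrete, here is a minimal k-means sketch in plain Python. It is an assumption-laden illustration rather than the case study's code: the points are made up, the function name is hypothetical, and the centroids are initialised from the first k points (real implementations usually use random or k-means++ initialisation). The returned inertia (within-cluster sum of squared distances) is the quantity typically plotted against k when looking for an elbow.

```python
import math


def kmeans(points, k, iterations=100):
    """Return (centroids, clusters, inertia) for a simple k-means run."""
    # Deterministic initialisation for illustration only: the first k points.
    centroids = [tuple(p) for p in points[:k]]
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    inertia = sum(
        math.dist(p, centroids[i]) ** 2
        for i, cluster in enumerate(clusters)
        for p in cluster
    )
    return centroids, clusters, inertia


# Two well-separated made-up groups of points.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]

for k in (1, 2):
    _, _, inertia = kmeans(points, k)
    print(k, round(inertia, 3))
```

On this toy data the inertia drops sharply from k=1 to k=2, which is the kind of bend an elbow plot is meant to reveal.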

K-means silhouette plot

Decision Tree Algorithms

This folder is all about decision tree algorithms! It ranges from standard decision trees to ensemble methods such as random forests and gradient boosting. We go through the foundations of decision tree algorithms and their motivations.

My favorite of these was definitely the Random Forest classifier case study. It covered a more complicated ensemble method that really shows off the power of decision trees. The case study showed not only how to create a random forest model but also how to analyze its performance with confusion matrices and how to look at variable importance.
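For readers unfamiliar with confusion matrices, the sketch below builds one from true and predicted labels and derives a few common metrics from it. The labels are made up and the helper name is hypothetical; in the case study the predictions would come from the random forest model.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true labels, columns are predicted labels."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


# Made-up binary labels for illustration.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

matrix = confusion_matrix(y_true, y_pred, labels=[0, 1])
(tn, fp), (fn, tp) = matrix

accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)
recall = tp / (tp + fn)

print(matrix)                       # [[3, 1], [1, 3]]
print(accuracy, precision, recall)  # 0.75 0.75 0.75
```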

Confusion matrix

Time Series Analysis

This folder contains a single time series analysis (TSA) case study. We create a sales forecast from existing data. The case study taught the basics of TSA for assessing both the data and the model: we consider seasonality and stationarity, and forecast with various ARIMA models.

This case study needs a bit more work but lays the groundwork for a basic understanding of TSA.
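One way to see how seasonality and stationarity interact is differencing, which is the "integrated" part of ARIMA. The sketch below is a minimal illustration on a made-up sales series (a linear trend plus a repeating four-period pattern); it does not fit an ARIMA model and is not the case study's code.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t where both exist."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


# Made-up quarterly sales: upward trend plus a seasonal pattern of period 4.
seasonal = [10, -5, 0, -5]
sales = [100 + 2 * t + seasonal[t % 4] for t in range(12)]

# Seasonal differencing (lag 4) cancels the repeating pattern.
deseasonalised = difference(sales, lag=4)
# First differencing (lag 1) then removes what is left of the trend.
stationary = difference(deseasonalised, lag=1)

print(deseasonalised)  # [8, 8, 8, 8, 8, 8, 8, 8]
print(stationary)      # [0, 0, 0, 0, 0, 0, 0]
```

Real sales data contains noise, so the differenced series would fluctuate around a constant level rather than being exactly flat; the point is that its mean no longer drifts over time.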

TSA

Hyperparameter Tuning

An important part of any data science project is understanding the hyperparameters and parameters that impact a machine learning model. Within this folder, we look at projects that focus on hyperparameter tuning methods such as grid search and Bayesian optimization.

Of these two, Bayesian optimization was my personal favorite. While slightly more complicated to implement, it gave me a far more robust and thorough understanding of hyperparameter tuning than grid search.
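For contrast with Bayesian optimization, here is a minimal sketch of what grid search does: it evaluates every combination in a fixed grid and keeps the best one. The scoring function is a hypothetical stand-in for a cross-validated model score, and the parameter names and values are made up. Bayesian optimization differs in that it uses the results of earlier evaluations to decide which combination to try next instead of visiting all of them.

```python
from itertools import product


def grid_search(objective, grid):
    """Evaluate every parameter combination and return the best (params, score)."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = objective(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(max_depth, learning_rate):
    # Hypothetical stand-in for a model's validation score;
    # by construction it peaks at max_depth=4, learning_rate=0.1.
    return -((max_depth - 4) ** 2) - (10 * (learning_rate - 0.1)) ** 2


grid = {"max_depth": [2, 4, 6], "learning_rate": [0.01, 0.1, 1.0]}
best_params, best_score = grid_search(toy_score, grid)
print(best_params)  # {'max_depth': 4, 'learning_rate': 0.1}
```

The cost of grid search grows multiplicatively with each added hyperparameter (here 3 x 3 = 9 evaluations), which is the main practical reason to consider smarter search strategies.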

Bayesian Optimization

SQL Project

One of the most vital skills for any data scientist is knowing how to use SQL. In this SQL project, we run SQL queries both in an RDBMS such as MySQL and from within Python, by creating an engine and issuing the queries through it.

In the SQL project you will find not only a notebook with the SQL queries but also the SQL database files that were used to store them.
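As a small illustration of running SQL from Python, the sketch below uses the standard library's `sqlite3` module with an in-memory database. This is a swapped-in technique chosen so the example runs without any setup; the project itself describes MySQL and a Python engine, and the table and column names here are hypothetical.

```python
import sqlite3

# In-memory SQLite database: nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Hypothetical table for illustration.
conn.execute(
    "CREATE TABLE members (id INTEGER PRIMARY KEY, name TEXT, joined_year INTEGER)"
)
conn.executemany(
    "INSERT INTO members (name, joined_year) VALUES (?, ?)",
    [("Ada", 2020), ("Grace", 2021), ("Linus", 2021)],
)

# Aggregate query: how many members joined in each year?
rows = conn.execute(
    "SELECT joined_year, COUNT(*) FROM members "
    "GROUP BY joined_year ORDER BY joined_year"
).fetchall()
conn.close()

print(rows)  # [(2020, 1), (2021, 2)]
```

The `?` placeholders let the driver handle quoting of values, which is safer than building the SQL string by hand.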
