
Hi there 👋

My name is Jeff Ott, and I am an engineer turned data scientist. I recently graduated from USF's Master's in Data Science program, where I tackled many topics and projects. I will post what projects I can, but at the university's request, the code is available only by specific request. I am interested in computer vision, deep learning, A/B testing, and using data engineering to reinforce these interests.

ADAS COG SCORE PREDICTION

Description: In work I did for the UCSF Brain Lab (found here: https://github.com/darenma/unet2021), we predicted ADAS-Cog, a memory exam given to patients to assess cognitive decline, from 3D MRI volumes. This was done with dual semantic segmentation U-Nets that predict both matter types and the regions of the Desikan-Killiany brain atlas. Using these data pipelines, I also predicted the ADAS-Cog score. Below are the results from one of the segmentation pipelines.
Libraries Used: PyTorch, TorchIO, PyTorch Lightning, Plotly, Sklearn

Figures: Parcellation, Raw MRIs

Bot Town

Description: Using a GPT-2 Seq2Seq model, I recreated different characters from the podcast series Critical Role. I then attached these bots to nodes that walked around and spoke to each other. Their paths can be seen below. The full repository is found here: https://github.com/Jeffotter/BotTown
Libraries Used: PyTorch, Pandas, NLTK, Matplotlib

Figures: Walking Characters, Sample Conversation

Data translation pipeline

Description: In this project, we wrote Python scripts that translate data between different formats from the command line.

Libraries Used: sys, untangle, xmltodict, json
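
As a rough illustration of this kind of translation, the sketch below converts a JSON object to XML. It is not the project's code: it uses only the standard library instead of untangle and xmltodict, and the names `json_to_xml`, `main`, and the `root` tag are hypothetical choices made for this example. It assumes the top-level JSON value is an object whose keys are valid XML tag names.

```python
import json
import sys
import xml.etree.ElementTree as ET


def _build(parent, key, value):
    """Append `value` under `parent` as one or more elements named `key`."""
    if isinstance(value, list):
        # A list becomes a repeated tag, one element per item.
        for item in value:
            _build(parent, key, item)
        return
    child = ET.SubElement(parent, key)
    if isinstance(value, dict):
        for k, v in value.items():
            _build(child, k, v)
    else:
        child.text = "" if value is None else str(value)


def json_to_xml(text, root_tag="root"):
    """Convert a JSON object string into an XML string."""
    data = json.loads(text)
    root = ET.Element(root_tag)
    for k, v in data.items():
        _build(root, k, v)
    return ET.tostring(root, encoding="unicode")


def main():
    """Read JSON on stdin and write XML on stdout."""
    sys.stdout.write(json_to_xml(sys.stdin.read()) + "\n")
```

Saved as a script that calls `main()` at the bottom (for example a hypothetical `json2xml.py`), this could be run as `python json2xml.py < input.json > output.xml`.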

Search Engine Implementation

Description: In this project, we implemented a search engine with both linear search and hash table search, then compared the two. We then created a local website with Flask to allow local users to access and use the engine.

Libraries Used: Flask, doc2vec, Regex, Codecs, Numpy
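
The difference between the two search strategies can be sketched as follows. This is a minimal illustration, not the project's implementation: it uses Python's built-in `dict` as the hash table rather than a hand-built one, and the function names and whitespace tokenizer are assumptions made for the example.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase and split on whitespace; a deliberately simple tokenizer."""
    return text.lower().split()


def linear_search(docs, terms):
    """Scan every document; return ids of documents containing all terms."""
    wanted = {t.lower() for t in terms}
    if not wanted:
        return []
    return sorted(doc_id for doc_id, text in docs.items()
                  if wanted <= set(tokenize(text)))


def build_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def index_search(index, terms):
    """Intersect the document sets of the query terms."""
    postings = [index.get(t.lower(), set()) for t in terms]
    if not postings:
        return []
    return sorted(set.intersection(*postings))
```

Linear search re-reads every document on each query, while the index is built once and each query then costs roughly one hash lookup per term plus a set intersection.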

TFIDF Document Summary

Description: In this project, we processed zipped XML data (44M uncompressed, 9164 files), stripped the XML markup, and tokenized the remaining text. We developed a workflow that calculates TF-IDF (term frequency-inverse document frequency) for each document.
Libraries Used: nltk, xml.etree.cElementTree, sklearn.feature_extraction.text, collections, zipfile, string
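
For reference, the TF-IDF calculation itself can be sketched in a few lines. This uses the textbook definition (term frequency times `log(N / df)`), which is an assumption for illustration; sklearn's `TfidfVectorizer` applies smoothing and normalization by default, so its numbers will differ. The function name `tfidf` and the input layout are hypothetical.

```python
import math
from collections import Counter


def tfidf(docs):
    """docs: dict of doc id -> list of tokens. Returns doc id -> {term: score}."""
    n_docs = len(docs)
    # Document frequency: the number of documents that contain each term.
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))
    scores = {}
    for doc_id, tokens in docs.items():
        counts = Counter(tokens)
        total = len(tokens)
        scores[doc_id] = {
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        }
    return scores
```

A term that appears in every document scores zero, so the highest-scoring terms in a document are those that are frequent there and rare elsewhere, which is what makes them usable as a summary.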

Recommendation of Articles

In this project, I was first introduced to word embeddings in the form of word2vec. I converted each document into a list of word embeddings and computed the centroid of each document. I then recommended documents based on the Euclidean distance between centroids. We then used Flask, Gunicorn, and Jinja to build a scalable website hosted on AWS EC2.

Libraries Used: flask, doc2vec, re, string, numpy, codecs
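
The centroid-and-distance idea can be sketched as below. This is an illustration under stated assumptions, not the project's code: it uses plain Python lists instead of numpy, expects the embeddings as a dict of word to vector, and the names `centroid` and `recommend` are hypothetical.

```python
import math


def centroid(words, embeddings):
    """Average the vectors of the words that have an embedding."""
    vectors = [embeddings[w] for w in words if w in embeddings]
    if not vectors:
        raise ValueError("no known words in document")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def recommend(target, doc_centroids, k=3):
    """Return the k document ids closest to `target` by Euclidean distance."""
    ranked = sorted(
        (math.dist(doc_centroids[target], vec), doc_id)
        for doc_id, vec in doc_centroids.items()
        if doc_id != target
    )
    return [doc_id for _, doc_id in ranked[:k]]
```

Centroids would be computed once per document and cached, so serving a recommendation only requires the distance ranking.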

Tweet Sentiment Analysis

In this project, I learned how to mine Twitter data and perform sentiment analysis to find a user's average sentiment. I then hosted this website on EC2 and let users search for the average sentiment of any public Twitter handle. This introduced me to web APIs and to classifying sentiment from raw text.

Libraries Used: flask, tweetie, colour, numpy, tweepy, vaderSentiment

Zillow Housing Prediction (Time Series)

In this project, we attempted to predict median housing prices in California using the unemployment rate and mortgage rate as helper variables. We tried three different models on the data: ETS, SARIMAX, and FB Prophet. We were able to get a moderately good prediction, with an RMSE of $7720. The methods and results are displayed in the Zillow Housing Prediction PDF.

Libraries Used: Pandas, Numpy, statsmodels, fbprophet, tqdm, sklearn, pmdarima, matplotlib
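
For readers unfamiliar with the metric, RMSE is the square root of the mean squared difference between actual and predicted values, so it is expressed in the same units as the target (here, dollars). A minimal sketch, with a hypothetical function name and no connection to the project's actual evaluation code:

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))
```

Because errors are squared before averaging, a few large misses raise RMSE more than many small ones.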

Linear Models

I implemented OLS, L2 regularization, and logistic regression in this project. I created functions to normalize the data and compute the loss gradient with and without regularization. I then used these functions to build LogisticRegression, LinearRegression, and RidgeRegression classes.
Libraries Used: pandas, numpy
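
The pieces described above (normalization, a loss gradient with optional L2 penalty, and gradient descent) might look roughly like this. It is a sketch under assumptions: the loss is taken to be mean squared error plus `lam * sum(beta**2)`, which may be scaled differently from the project's version, plain Python lists stand in for numpy arrays, and all function names are hypothetical.

```python
def normalize(X):
    """Standardize each column to zero mean and unit variance."""
    n = len(X)
    cols = list(zip(*X))
    means = [sum(c) / n for c in cols]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by 0.
    stds = [(sum((v - m) ** 2 for v in c) / n) ** 0.5 or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in X]


def ridge_gradient(X, y, beta, lam):
    """Gradient of mean squared error plus lam * sum(beta**2)."""
    n = len(X)
    residuals = [sum(b * x for b, x in zip(beta, row)) - yi
                 for row, yi in zip(X, y)]
    return [
        (2.0 / n) * sum(r * row[j] for r, row in zip(residuals, X))
        + 2.0 * lam * beta[j]
        for j in range(len(beta))
    ]


def fit_ridge(X, y, lam=0.0, lr=0.1, steps=2000):
    """Gradient descent on standardized X; the intercept is mean(y)."""
    beta = [0.0] * len(X[0])
    y_mean = sum(y) / len(y)
    centered = [yi - y_mean for yi in y]
    for _ in range(steps):
        grad = ridge_gradient(X, centered, beta, lam)
        beta = [b - lr * g for b, g in zip(beta, grad)]
    return y_mean, beta
```

Setting `lam=0.0` gives the unregularized (OLS) fit; a positive `lam` shrinks the coefficients toward zero while leaving the intercept unpenalized.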

Naive Bayes

In this project, I built a multinomial Naive Bayes classifier to predict whether a movie review was positive or negative. I used Laplace smoothing to handle unseen words and vectorized operations to increase speed. I then used a k-fold cross-validation class I coded to train the model and compare it against Sklearn. I achieved 80% accuracy with this model.
Libraries Used: sklearn, numpy, time, codecs, re
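
A compact, non-vectorized sketch of multinomial Naive Bayes with Laplace smoothing is shown below. It is illustrative only: the project used vectorized numpy operations, and the names `train_nb` and `predict_nb`, along with the choice to reserve one extra vocabulary slot for unknown words, are assumptions made for this example.

```python
import math
from collections import Counter


def train_nb(docs, labels):
    """docs: list of token lists; labels: the class label of each document."""
    vocab = {w for doc in docs for w in doc}
    class_counts = Counter(labels)
    word_counts = {c: Counter() for c in class_counts}
    for doc, label in zip(docs, labels):
        word_counts[label].update(doc)
    log_prior = {c: math.log(n / len(labels)) for c, n in class_counts.items()}
    log_lik = {}
    for c, counts in word_counts.items():
        # Laplace smoothing: add 1 to every count, with one extra slot
        # reserved for words never seen in training.
        denom = sum(counts.values()) + len(vocab) + 1
        log_lik[c] = {w: math.log((counts[w] + 1) / denom) for w in vocab}
        log_lik[c][None] = math.log(1 / denom)  # fallback for unknown words
    return log_prior, log_lik


def predict_nb(model, doc):
    """Return the class with the highest log posterior for a token list."""
    log_prior, log_lik = model
    scores = {
        c: log_prior[c] + sum(log_lik[c].get(w, log_lik[c][None]) for w in doc)
        for c in log_prior
    }
    return max(scores, key=scores.get)
```

Working in log space avoids underflow when many small word probabilities are multiplied, and the smoothing keeps a single unseen word from driving a class probability to zero.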

Decision Trees

In this project, I attempted to recreate Sklearn's decision trees using recursively constructed trees. I implemented LeafNode, DecisionNode, and DecisionTree classes, splitting on Gini impurity for classification and MSE for regression. I then inherited from these classes in my RegressionTree and ClassifierTree classes. I compared these with the Sklearn implementations and found only a small difference in results.
Libraries Used: numpy, scipy.stats, lolviz
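
The classification split criterion can be sketched as follows: compute Gini impurity, then pick the threshold that minimizes the size-weighted impurity of the two children. This is a single-feature illustration with hypothetical function names, and testing midpoints between sorted unique values is an assumption rather than the project's confirmed strategy.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(xs, ys):
    """Return (threshold, weighted_gini) for the best split on one feature.

    The threshold is None when no candidate split improves on the parent.
    """
    best_t, best_score = None, gini(ys)
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if score < best_score:
            best_t, best_score = t, score
    return best_t, best_score
```

A recursive tree builder would call `best_split` on each feature, create a decision node at the winning threshold, and emit a leaf when the threshold comes back as `None`.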

Random Forest

Using my decision tree implementation from before, I was tasked with combining these trees to build a random forest. I built RandomForestRegressor and RandomForestClassifier classes. I had to implement bootstrapping, subsampling, out-of-bag error estimation, and random forest prediction to get accuracy comparable to Sklearn.
Libraries Used: numpy, scipy.stats, lolviz
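
Bootstrapping and out-of-bag (OOB) scoring can be sketched independently of the trees themselves. The helpers below are hypothetical and only illustrate the bookkeeping: each tree trains on rows sampled with replacement, and each row is later scored using only the trees that never saw it.

```python
import random
from collections import Counter


def bootstrap_indices(n, rng):
    """Sample n row indices with replacement; also return the out-of-bag rows."""
    in_bag = [rng.randrange(n) for _ in range(n)]
    out_of_bag = sorted(set(range(n)) - set(in_bag))
    return in_bag, out_of_bag


def majority_vote(predictions):
    """Combine the per-tree class predictions for one sample."""
    return Counter(predictions).most_common(1)[0][0]


def oob_accuracy(y, oob_votes):
    """oob_votes[i] holds predictions from trees that did not train on row i."""
    scored = [(majority_vote(v), y[i]) for i, v in oob_votes.items() if v]
    if not scored:
        raise ValueError("no row has any out-of-bag predictions")
    return sum(p == t for p, t in scored) / len(scored)
```

Rows that happen to be in every tree's bootstrap sample have no OOB votes and are skipped, which is why the OOB estimate becomes more reliable as the number of trees grows.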

Jeff O.'s Projects

bootstrap-calendar

Full view calendar with year, month, week and day views based on templates with Twitter Bootstrap.

bottown

Just another fantasy town with some interesting things to say

msds610

This is code for our stream and I/O project

msds621

Course notes for MSDS621 at Univ of San Francisco, introduction to machine learning

msds689

Course syllabus, notes, projects for USF's MSDS689

stanford_alpaca

Code and documentation to train Stanford's Alpaca models, and generate the data.

usf

These are my projects available to share from my time at USF

xtreg2way

An Algorithm to Estimate the Two-Way Fixed Effects Model
