Rochita Sundar's Projects
The aim is to develop an ML-based predictive classification model (logistic regression & decision trees) to predict which hotel bookings are likely to be canceled. This is done by analysing different attributes of customers' booking details. Being able to predict accurately in advance whether a booking is likely to be canceled will help formulate profitable policies for cancelations & refunds.
This project aims to build & optimise a book recommendation system based on collaborative filtering, and tackles an example of both a memory-based and a model-based approach (using KNNWithMeans & Singular Value Decomposition).
The aim is to find an optimal ML model (Decision Tree, Random Forest, Bagging or Boosting Classifiers with hyperparameter tuning) to predict visa statuses for work visa applicants to the US. This will help decrease the time spent processing applications (currently increasing at a rate of >9% annually) while formulating a suitable profile of candidates more likely to have the visa certified.
This repository contains my code solutions to DeepLearning.AI's Practical Data Science On AWS Cloud Specialization.
Code for the online course "Deployment of Machine Learning Models"
This repository contains the lab work for the Coursera course "Generative AI with Large Language Models".
This repository contains my code solutions to Udacity's coursework 'Intro to Deep Learning with PyTorch'.
The aim is to decrease the maintenance cost of generators used in wind energy production machinery. This is achieved by building various classification models, accounting for class imbalance, tuning on a user-defined cost metric (a function of the true positives, false positives and false negatives predicted) & productionising the model using pipelines.
The objective is to build an ML-based solution (linear regression model) to develop a dynamic pricing strategy for used and refurbished smartphones, identifying the factors that significantly influence their price.
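As a rough illustration of a user-defined cost metric of this kind, the sketch below counts true positives, false positives and false negatives and weights each by a cost. The function name `maintenance_cost` and the three cost weights are hypothetical placeholders chosen for the example; they are not taken from the project itself.

```python
from typing import Sequence


def maintenance_cost(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    cost_tp: float = 15.0,  # hypothetical cost of a correctly predicted failure
    cost_fp: float = 5.0,   # hypothetical cost of a false alarm
    cost_fn: float = 40.0,  # hypothetical cost of a missed failure
) -> float:
    """Total cost implied by a set of binary predictions (1 = failure)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the three outcome types that carry a cost.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    # True negatives are assumed to cost nothing in this sketch.
    return tp * cost_tp + fp * cost_fp + fn * cost_fn


if __name__ == "__main__":
    actual = [1, 1, 0, 0, 1]
    predicted = [1, 0, 1, 0, 1]
    print(maintenance_cost(actual, predicted))
```

A function like this can be minimised during model selection in place of accuracy, so that a missed failure is penalised more heavily than a false alarm.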
The data relates to several user actions or interests recorded on two variants of landing pages for an online news portal. The objective is to analyse these interests by performing statistical analyses to determine if one variant is more effective based on chosen metrics (A/B testing).
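One common way to compare two landing page variants is a two-proportion z-test on a conversion-style metric. The sketch below is a minimal standard-library version; the choice of test, the function name `two_proportion_z_test` and the counts in the usage example are assumptions made for illustration, not the analysis or data from the project.

```python
import math
from typing import Tuple


def two_proportion_z_test(
    success_a: int, n_a: int, success_b: int, n_b: int
) -> Tuple[float, float]:
    """Return (z statistic, two-sided p-value) for H0: rate_a == rate_b."""
    p_a = success_a / n_a
    p_b = success_b / n_b

    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (success_a + success_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))

    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Made-up counts: conversions out of visitors for each variant.
    z, p = two_proportion_z_test(success_a=21, n_a=50, success_b=33, n_b=50)
    print(f"z = {z:.3f}, p = {p:.4f}")
```

A small p-value suggests the difference in rates is unlikely under the null hypothesis; the significance threshold is a choice made before running the test.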
The project involves performing clustering analysis (K-Means, hierarchical clustering, visualization post PCA) to group stocks that share similar characteristics or that have minimal correlation with one another. A diversified portfolio tends to yield higher returns and carry lower risk by tempering potential losses when the market is down.
Storyboard published on Tableau Public: https://public.tableau.com/app/profile/rsundar/viz/CanadianSuperstoreDatasetVisualization/CanadianSuperstoreDataset
Streamlit Smile Detector App
The data consists of tweets scraped using the Twitter API. The objective is sentiment labelling using a lexicon approach, performing text pre-processing (such as language detection, tokenisation, normalisation, vectorisation), and building pipelines for text classification models for sentiment analysis, followed by explainability of the final classifier.
Scraped tweets using the Twitter API (for the keyword "Netflix") on an AWS EC2 instance and ingested the data into S3 via Kinesis Firehose. Used Spark ML on Databricks to build a pipeline for a sentiment classification model, and Athena & QuickSight to build a dashboard.
This project aims to scrape the website of Vancouver Public Library using test automation software. The automated tool will scrape more than 70K records to gather information on the specific language collection, title, author, category, availability status and ratings of international language material to draw insights.