Chesa254's Projects
API Scraping - Web Scraping Real Estate Data + PostgreSQL
First place solution
Benchmarking different approaches for categorical encoding for tabular data
The Data Engineering Cookbook
Designed an unsupervised learning model using K-Means clustering on customer spending-score data with 5+ features. This helped the client target customers according to their income and spending score.
Code for the case studies in the Bekes-Kezdi Data Analysis textbook
Here you will find some projects using Python in a data science context, including: Data Processing, Data Analysis, NumPy, Pandas, Data Visualization, Matplotlib, Seaborn, Machine Learning, Linear Regression, Logistic Regression, K Nearest Neighbors, Decision Trees, Random Forests, Support Vector Machines, K Means Clustering, Recommender Systems, Natural Language Processing, Spark with Python, Neural Nets and Deep Learning.
This is my first GitHub project, for practice
Self-contained Data Science Project in Jupyter using Python and Linear Regression to Predict House Prices
Hypothesis testing is one of the most fundamental elements of inferential statistics. In modern languages like Python and R, these tests are easy to conduct — often with a single line of code. But it never fails to puzzle me how few people use them or understand how they work. In this repository I would like to use an example to show three common hypothesis tests and how they work under the hood, as well as showing how to run them in R and Python and to understand the results.
This project describes R code that takes in car number plates and calculates how many cars have been bought between two number plates.
A group of journalists reporting in the Nile Basin region has access to historic and projected rainfall and runoff data for the Lake Victoria Basin. They seek to find the past and future trends in the data, summarize the findings, and visualize the insights drawn from the dataset to tell a story of the climatic conditions of the region.
Linear Regression Data Science Project and Machine Learning Bootcamp
PROBLEM STATEMENT: Visualize various market segments and identify indicator variables that influence sales patterns over time in Kenya.
Mobile Financial System sent a dataset to test my data science and analytics skills for a data science role at the company. This repository outlines the project details and the outcomes of my analysis, plus an end-to-end report in the README.
In this week's independent project, you will be working as a data scientist for an electric car-sharing service company. You have been tasked with processing stations data to understand electric car usage over time by answering the following research question: identify the most popular hour of the day for picking up a shared electric car (Bluecar) in the city of Paris over the month of April 2018.
This Learning Path gave me foundational knowledge of core cloud computing concepts and an understanding of Oracle Cloud Infrastructure (OCI) cloud services. It built my skills in OCI concepts and prepared me for the Oracle Cloud Infrastructure Foundations Associate certification.
This repository contains my Kaggle project on loan application prediction in Python, along with Python code for linear regression, random forest, k-means, and SVM, plus some simple, fun code for improving Python coding skills.
Data Types Codes
Moringa School SQL DBMS class
Forecasting on Time Series data in R
Forecasting Stock Returns in R
Analysis of Uber Data from NYC Open Data website
A predictive model to help Uber drivers make more money
PYTHON IP
Overview: In this week's independent project, you will be working as a Data Scientist for MTN Cote d'Ivoire, a leading telecom company, and you will be answering the following research question. MTN Cote d'Ivoire would like to upgrade its technology infrastructure for its mobile users in Ivory Coast. Studying the given dataset, how should MTN Cote d'Ivoire go about upgrading its infrastructure within the given cities? Your final deliverable will be a Data Report comprising the following sections: Business Understanding, Data Understanding, Data Preparation, Analysis, Recommendation, and Evaluation. You can use the CRISP-DM methodology to guide you while working on the Data Report. Your Data Report will also need to give an objective account, with insights coming mainly from the dataset. However, you can refer to external sources for supporting information. Below are some questions to get you started: Which cities were the most used over the three days? Which cities were the most used during business and home hours? The telecom data provided for this project is only a sample (i.e. for only three days). The data files that you will need for this project are as follows: cells_geo_description.xlsx [Link], cells_geo.csv [Link], CDR_description.xlsx [Link], CDR 20120507 [http://bit.ly/TelecomDataset1], CDR 20120508 [http://bit.ly/TelecomDataset2], CDR 20120509 [http://bit.ly/TelecomDataset3]
Use case of WhatsApp chat analysis