Hi there 👋

Quick overview:

Business-oriented Data Scientist with experience across the whole data funnel, from preprocessing to model creation. I have worked with programming languages such as Python and SQL for 4+ years, as well as business-oriented tools such as Tableau, PowerBI and Looker, which have enabled me to excel in data-driven projects and deliver actionable insights for companies. I hold a Mathematics & Statistics degree from the University of Warwick and have completed Data Mining & Statistics programmes at Stanford, a Digital MBA at ISDI and a Machine Learning Bootcamp at Ironhack.

Data Scientist work experience: Hipoo, Seedtag (see below)

Individual projects: all other projects (see below)

Ricardo_Bravo_Portfolio

This is a portfolio of the projects I have worked on as a Data Scientist. Click on the blue links to go to the repository of each individual project.

Eurovision Winner Predictor

In this project I was given a dataset of semifinals and finals from 2002 to 2009, and the objective was to predict the leaderboard order of the 2010 final.

Because this is a supervised regression problem, I decided to predict the number of points each country receives instead of its final position, since regression models are less accurate at predicting discrete values such as a rank. Once the points are predicted, the countries can be ordered by them to obtain the leaderboard. Aside from betting data, the data on gender and home/away country affected the final result the most.
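The "predict points, then order" step can be sketched as follows. This is a minimal illustration, not the project's actual code: the function name `leaderboard`, the country labels and the point values are all made up for the example, and the predicted points are assumed to come from whatever regression model was trained.

```python
def leaderboard(predicted_points):
    """Order countries from highest to lowest predicted points.

    predicted_points: dict mapping country name -> predicted points
    (assumed to be the output of a regression model).
    Returns a list of (position, country, points) tuples.
    """
    # Sort by points descending; break ties alphabetically so the
    # ordering is deterministic.
    ranked = sorted(predicted_points.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(pos, country, pts) for pos, (country, pts) in enumerate(ranked, start=1)]


# Hypothetical predictions, for illustration only.
example = {"Country A": 120.5, "Country B": 201.3, "Country C": 87.0}

for pos, country, pts in leaderboard(example):
    print(pos, country, pts)
```

Predicting a continuous target and ranking afterwards keeps the model a plain regressor; the discrete position only appears in this final sorting step.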

2010 Final Winner (Model): Germany

2010 Final Winner (Actual): Germany

Midbootcamp project Real Estate

Here are the main results of our midbootcamp project, which consisted of implementing a regression model.

Let us briefly explain what it is about.

Scenario

We are working as analysts for a real estate company. Our company wants to build a machine learning model to predict the selling prices of houses based on a variety of features on which the value of the house is evaluated.

Objective

Our job is to build a model that will predict the price of a house based on the features provided in the dataset. Senior management also wants to explore the characteristics of the houses using business intelligence tools. One of their questions is which factors are responsible for higher property values ($650K and above).

Expected Outcomes

Since this is a regression problem, you can use linear regression to build a model. You are also encouraged to use other models in your project, including a KNN regressor and decision trees for regression.

1. Explore the data: You can use the techniques that have been discussed in class, such as the describe method, checking for null values, and using Matplotlib and Seaborn to develop visualizations.

The data has many categorical and numerical variables. Explore the nature of data for these variables before you start with the data cleaning process and then data pre-processing (scaling numerical variables and encoding categorical variables).

2. Build a model: Use different models, compare their accuracies, and find the model that best fits your data. You can use the measures of accuracy that have been discussed in class. When comparing different models, make sure you use the same measure of accuracy as the benchmark.

3. Visualize: You will use Tableau to visually explore the data further.
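The advice in step 2 about benchmarking every model with the same measure can be sketched as below. This is a library-free illustration under stated assumptions, not the project's code: the toy data is made up (and deliberately linear, so the linear model is expected to win), RMSE is chosen here as the shared metric, and the helpers `fit_mean`, `fit_linear` and `fit_knn` are hypothetical one-feature stand-ins. In practice scikit-learn's `LinearRegression`, `KNeighborsRegressor` and `DecisionTreeRegressor` would normally play these roles.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: the single benchmark used for every model."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def fit_mean(xs, ys):
    """Baseline: always predict the mean training price."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_linear(xs, ys):
    """One-feature least-squares line (closed form)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


def fit_knn(xs, ys, k=2):
    """KNN regressor: average the prices of the k closest training houses."""
    def predict(x):
        nearest = sorted(zip(xs, ys), key=lambda pair: abs(pair[0] - x))[:k]
        return sum(y for _, y in nearest) / k
    return predict


# Made-up toy data: living area (sq ft) -> price (USD).
train_x = [800, 1000, 1200, 1500, 1800, 2200]
train_y = [200_000, 250_000, 300_000, 375_000, 450_000, 550_000]
test_x = [900, 1600, 2000]
test_y = [225_000, 400_000, 500_000]

models = {"mean baseline": fit_mean, "linear": fit_linear, "knn (k=2)": fit_knn}

# Fit every model on the same training split and score it with the same metric.
scores = {}
for name, fit in models.items():
    predict = fit(train_x, train_y)
    scores[name] = rmse(test_y, [predict(x) for x in test_x])

best = min(scores, key=scores.get)
for name, score in sorted(scores.items(), key=lambda kv: kv[1]):
    print(f"{name}: RMSE = {score:,.0f}")
print("best model:", best)
```

Keeping the train/test split and the metric fixed across models is what makes the scores comparable; swapping the metric for one model would make the ranking meaningless.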

Presentation of Data Science projects at Hipoo:

https://www.canva.com/design/DAFyciKV1Q4/gtJy8c3C7TmPkTFQGH9jhA/edit?utm_content=DAFyciKV1Q4&utm_campaign=designshare&utm_medium=link2&utm_source=sharebutton

Presentation of Data Science projects at Seedtag:

https://www.canva.com/design/DAFydy13aIA/dO-14oaOd5s-Bm71xlOoKQ/edit?utm_content=DAFydy13aIA&utm_campaign=designshare&utm_medium=link2&utm_source=sharebutton

FIFA project

For this project, we had the challenge of working with a dataset from the FIFA 19 football game. We used a dataset from this project brief.

Approach

We decided to create a fictional data analytics consulting firm, the Data Dribblers, who specialise in the football (or soccer) industry.

The problem that we investigated was that clubs are not getting a return on their investment when buying players. Clubs use a variety of features to choose players but lack a data-based approach for choosing the players who are the best value for money.

Our hypothesis was that we could identify under-valued players by creating a ranking model using performance attributes.

Methodology

Using our database as a snapshot of player performance, we developed a model that predicts market value based on objective performance measures, and compared each prediction with the player's actual market value. In this way we can generate lists of undervalued players who are high performers.
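The comparison between predicted and actual market value can be sketched as follows. This is a minimal sketch under assumptions, not the team's implementation: the function name `undervalued`, the field names `market_value` and `predicted_value`, the `min_gap` threshold and the player figures are all hypothetical, and the predicted values are assumed to come from the performance-based model described above.

```python
def undervalued(players, min_gap=0.0):
    """List players whose predicted value exceeds their actual market value.

    players: list of dicts with "name", "market_value" and "predicted_value"
    (values in the same currency unit).
    Returns (name, gap) tuples sorted by gap, largest first.
    """
    gaps = [(p["name"], p["predicted_value"] - p["market_value"]) for p in players]
    # Keep only players the model values above their market price.
    return sorted((g for g in gaps if g[1] > min_gap), key=lambda g: g[1], reverse=True)


# Made-up players; values in millions of euros, for illustration only.
players = [
    {"name": "Player A", "market_value": 10.0, "predicted_value": 14.5},
    {"name": "Player B", "market_value": 30.0, "predicted_value": 27.0},
    {"name": "Player C", "market_value": 5.0, "predicted_value": 6.0},
]

print(undervalued(players))
```

Raising `min_gap` shortens the list to the players with the largest difference between what the model predicts and what the market currently asks.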

THE MATCH PREDICTOR - RICARDO BRAVO

This is my final project for the Data Analysis Bootcamp at Ironhack, where I use statistics about each player in the Top 500 to determine who will win a head-to-head match.