
Hi there 👋

Quick overview:

Business-oriented Data Scientist experienced across the whole data funnel, from preprocessing to model creation. I have worked with Python and SQL for 4+ years, alongside business intelligence tools such as Tableau, Power BI and Looker, which has let me turn data-driven projects into actionable insights for companies. I hold a Mathematics & Statistics degree from the University of Warwick and have completed Data Mining & Statistics programmes at Stanford, a Digital MBA at ISDI and a Machine Learning Bootcamp at Ironhack.

Data Scientist experience: Hipoo, Seedtag (see below)

Individual projects: everything else (see below)

Ricardo_Bravo_Portfolio

This is a portfolio of the projects I have worked on as a Data Scientist. Click on the blue links to go to the repository of each project.

Euro

In this project I was given a dataset of Eurovision semi-finals and finals from 2002 to 2009, and the objective was to predict the leaderboard order of the 2010 final.

I framed this as a supervised regression problem and predicted the number of points each entry would score rather than its final position, because regression models are less accurate at predicting discrete values such as ranks. Once the points are predicted, sorting the entries by points gives the leaderboard. Aside from betting data, the features that affected the final result the most were gender and whether the country was competing at home or away.
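The step from predicted points to a leaderboard can be sketched as follows. This is a minimal illustration, not the project's actual code: the function name `leaderboard`, the country labels and the point values are all hypothetical.

```python
def leaderboard(predicted_points):
    """Return entries ordered from highest to lowest predicted points."""
    # sorted() is stable, so tied entries keep their input order
    return sorted(predicted_points, key=predicted_points.get, reverse=True)

# Hypothetical predicted points, for illustration only
predicted = {"Country A": 120.5, "Country B": 87.0, "Country C": 143.2}
ranking = leaderboard(predicted)

for position, country in enumerate(ranking, start=1):
    print(position, country)
```

Predicting a continuous score and ranking afterwards keeps the model a plain regressor; the ordering is derived purely from the sort.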

2010 final winner (model): Germany

2010 final winner (actual): Germany



Here are the main results of our mid-bootcamp project, which consists of implementing a regression model.

Let us explain briefly what it is about.

Scenario

We are working as analysts for a real estate company. Our company wants to build a machine learning model to predict the selling prices of houses based on a variety of features on which the value of the house is evaluated.

Objective

Our job is to build a model that will predict the price of a house based on the features provided in the dataset. Senior management also wants to explore the characteristics of the houses using business intelligence tools. Part of that exploration is understanding which factors are responsible for higher property values ($650K and above).

Expected Outcomes

Since this is a regression model, you can use linear regression for building a model. You are also encouraged to use other models in your project including KNN regressor, decision trees for regression.

1. Explore the data. You can use the techniques that have been discussed in class, such as the describe method, checking for null values, and developing visualizations with Matplotlib and Seaborn.

The data has many categorical and numerical variables. Explore the nature of data for these variables before you start with the data cleaning process and then data pre-processing (scaling numerical variables and encoding categorical variables).

2. Build a model. Use different models, compare their accuracy, and find the one that best fits your data. You can use the accuracy measures that have been discussed in class. When comparing different models, make sure you use the same measure as the benchmark for all of them.

3. Visualize. You will use Tableau to visually explore the data further.
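The model comparison in step 2 can be sketched as below. This is an illustrative example under stated assumptions, not the project's code: the project would most likely use a library such as scikit-learn, whereas this sketch uses plain Python so it runs without dependencies. The single feature (living area), the prices and the helper names `rmse`, `fit_linear` and `fit_knn` are all made up. The point it shows is that every model is scored with the same measure (here RMSE) on the same held-out data.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error: the single benchmark used for every model."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def fit_linear(xs, ys):
    """Ordinary least squares on one feature; returns a predict function."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x

def fit_knn(xs, ys, k=2):
    """k-nearest-neighbours regressor on one feature; returns a predict function."""
    def predict(x):
        nearest = sorted(zip(xs, ys), key=lambda pair: abs(pair[0] - x))[:k]
        return sum(y for _, y in nearest) / k
    return predict

# Made-up data for illustration: living area (sq ft) -> price (thousand $)
train_x = [1000, 1500, 2000, 2500, 3000]
train_y = [300, 400, 500, 600, 700]
test_x = [1200, 2800]
test_y = [340, 660]

models = {
    "linear": fit_linear(train_x, train_y),
    "knn": fit_knn(train_x, train_y, k=2),
}

# Score every model with the same metric on the same held-out data
scores = {name: rmse(test_y, [predict(x) for x in test_x])
          for name, predict in models.items()}
best = min(scores, key=scores.get)
print(scores, "best:", best)
```

Because both scores come from the same function and the same test set, the model with the lower value can be picked directly.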


Presentation: Data Science Projects at Hipoo

https://www.canva.com/design/DAFyciKV1Q4/gtJy8c3C7TmPkTFQGH9jhA/edit?utm_content=DAFyciKV1Q4&utm_campaign=designshare&utm_medium=link2&utm_source=sharebutton

Presentation: Data Science Projects at Seedtag

https://www.canva.com/design/DAFydy13aIA/dO-14oaOd5s-Bm71xlOoKQ/edit?utm_content=DAFydy13aIA&utm_campaign=designshare&utm_medium=link2&utm_source=sharebutton



For this project, we had the challenge of working with a dataset from the FIFA 19 football game. We used a dataset from this project brief.

Approach

We decided to create a fictional data analytics consulting firm, the Data Dribblers, who specialise in the football (or soccer) industry.

The problem we investigated was that clubs are not getting a return on their investment when buying players. Clubs use a variety of features to choose players but lack a data-based approach for finding those who are the best value for money.

Our hypothesis was that we could identify under-valued players by creating a ranking model using performance attributes.

Methodology

Using our database as a snapshot of player performance, we developed a model that predicts market value based on objective performance measures and compared it with their actual market value. In this way we can generate lists of undervalued players that are high performers.
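The comparison between predicted and actual market value can be sketched as follows. This is a minimal illustration, not the project's implementation: the field names, player labels, values and the `min_gap` parameter are hypothetical, and the predicted values are assumed to come from an already trained model.

```python
def undervalued(players, min_gap=0.0):
    """Return (name, gap) pairs where predicted value exceeds actual value.

    gap = predicted_value - actual_value; a larger gap means more undervalued.
    """
    gaps = [(p["name"], p["predicted_value"] - p["actual_value"]) for p in players]
    shortlisted = [g for g in gaps if g[1] > min_gap]
    return sorted(shortlisted, key=lambda g: g[1], reverse=True)

# Hypothetical players and values (in million EUR), for illustration only
players = [
    {"name": "Player A", "actual_value": 10, "predicted_value": 18},
    {"name": "Player B", "actual_value": 40, "predicted_value": 35},
    {"name": "Player C", "actual_value": 5, "predicted_value": 8},
]

shortlist = undervalued(players)
for name, gap in shortlist:
    print(name, gap)
```

Raising `min_gap` narrows the list to players whose predicted value exceeds their actual value by a wider margin.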



This is my final project for the Data Analysis Bootcamp at Ironhack, in which I use statistics about each player in the ATP Top 500 to predict who will win a head-to-head match.
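One common way to turn per-player statistics into a head-to-head input is to take the difference between the two players' values for each statistic. The sketch below illustrates that idea only; it is an assumption, not necessarily how this project builds its features, and the statistic names and numbers are hypothetical.

```python
def matchup_features(player_a, player_b):
    """Difference (A minus B) for every statistic both players have."""
    return {stat: player_a[stat] - player_b[stat]
            for stat in player_a if stat in player_b}

# Hypothetical statistics (percentages), for illustration only
player_a = {"first_serve_pct": 65, "break_points_saved_pct": 62}
player_b = {"first_serve_pct": 61, "break_points_saved_pct": 66}

features = matchup_features(player_a, player_b)
print(features)
```

A positive value means player A is ahead on that statistic; a classifier can then be trained on these differences with the match winner as the label.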


Canva Presentation: https://www.canva.com/design/DAFpqtA7zsg/3M_0Q-65LLObk-iVmXvC-w/edit

Tableau Public: https://public.tableau.com/app/profile/ricardo.bravo1853/viz/TennisFinalProjectFinal/BreakPointsSavedFirstServes?publish=yes

Model folder: the model itself, together with the dataset obtained by web scraping the official ATP website.

SQL folder: the SQL queries I used to extract useful information.

Tableau: the Tableau file where I did basic EDA and drew the first insights from the data.

Web Scraping: the web scraping code for all the data I initially used, along with the APIs I used. The final dataset is at the very end.

Streamlit: the code needed to run the Streamlit app shown in the presentation. The code also contains the model.

Hope you are ready to make some money



Statistics notebooks and other relevant knowledge

Random Forest: https://github.com/ricardobravo98/lab-random-forests-Ricardo/tree/master/files_for_lab

Handling Data Imbalance: https://github.com/ricardobravo98/lab-handling-data-imbalance-classification-Ricardo

Inferential Statistics: https://github.com/ricardobravo98/lab-inferential-statistics-Ricardo

Cross Validation: https://github.com/ricardobravo98/lab-cross-validation-Ricardo

Unsupervised Learning: https://github.com/ricardobravo98/lab-unsupervised-learning-intro-Ricardo

T-test P values: https://github.com/ricardobravo98/lab-t-tests-p-values-Ricardo

Web Scraping: https://github.com/ricardobravo98/lab-web-scraping-single-page-Ricardo

Ricardo Bravo's Projects

eurovision

This is a model I created to predict the Winner of the Eurovision Contest in 2010
