my_udacity_capstone's Introduction

Udacity Capstone Project

Title:

Machine Learning Algorithms for Football Prediction using statistics from Brazilian championship data

Abstract:

This article evaluates the prediction of football (soccer) match results (victory, draw, loss) in the Brazilian Football Championship, using several machine learning models trained on real-world data from actual matches. The models were tested repeatedly and their average predictive results were compared. Logistic regression and support vector machines yielded the best results, with higher average accuracy than the other classifiers (KNN and Random Forest). Logistic regression reached 49.77% accuracy, almost 17 percentage points better than a random guess (the benchmark), which has a 33% chance of success. In addition, the features were ranked by relative importance to guide the use of the data.

Motivation:

Applying machine learning techniques to football result prediction, in order to understand the data and evaluate how difficult the prediction is.

Libraries:

This section lists the libraries used and the files in which they appear.

  • Pandas: Data handling. Used in the web scraping and treatment files.
  • Numpy: Numerical data handling. Used in the treatment file.
  • Seaborn: Plots graphs and images. Used in the treatment file.
  • Beautiful Soup: Parses the HTML of the scraped pages. Used in the web scraping file.
  • Matplotlib.pyplot: Plots graphs and images. Used in the treatment file.
  • Xlsxwriter: Writes the extracted database to an Excel file. Used in the treatment file.
  • Os: Gives easy access to files inside a folder on the computer. Used in the web scraping file.
  • Requests: Sends HTTP/1.1 requests. Used in the web scraping file.
  • Time: Measures the running time of some algorithms. Used in the web scraping file.
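To illustrate how Beautiful Soup and Pandas can work together in a scraping step, here is a minimal sketch. It is not the code from the WebScrapping file: the HTML snippet, the column names, and the `table_to_frame` helper are all hypothetical, and the page is an inline string instead of a Requests download so that the example runs offline.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a downloaded page; the real site's
# markup and column names are not reproduced here.
HTML = """
<table id="matches">
  <tr><th>home</th><th>away</th><th>score</th></tr>
  <tr><td>Team A</td><td>Team B</td><td>2-1</td></tr>
  <tr><td>Team C</td><td>Team D</td><td>0-0</td></tr>
</table>
"""


def table_to_frame(html: str) -> pd.DataFrame:
    """Parse the first <table> in the HTML into a DataFrame (illustrative helper)."""
    soup = BeautifulSoup(html, "html.parser")
    rows = [
        [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
        for tr in soup.find("table").find_all("tr")
    ]
    header, *body = rows  # first row holds the column names
    return pd.DataFrame(body, columns=header)


matches = table_to_frame(HTML)

# Split the "2-1" style score into two integer columns.
goals = matches["score"].str.split("-", expand=True).astype(int)
matches["home_goals"] = goals[0]
matches["away_goals"] = goals[1]

print(matches)
```

In a real scraper the string would come from something like `requests.get(url).text`, ideally with a pause between requests to avoid overloading the site.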

Summary of Results

As the table below shows, the algorithm achieved almost 50% accuracy. Given that a random prediction succeeds 33% of the time (victory/draw/loss), this is a good sign: the algorithm was able to identify some patterns.
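The comparison with the benchmark can be worked out explicitly. The short sketch below uses the 49.77% logistic regression accuracy reported in this README and assumes a uniform random guess over the three outcomes; the variable names are illustrative. It shows that the "almost 17%" advantage is a difference in percentage points, and also gives the corresponding relative improvement.

```python
# Accuracy reported in this README for logistic regression.
model_accuracy = 0.4977

# A uniform random guess over three outcomes (victory, draw, loss)
# is right one time in three, whatever the real class distribution is.
n_outcomes = 3
baseline = 1 / n_outcomes

# Absolute gain, in percentage points.
gain_points = (model_accuracy - baseline) * 100

# Relative gain over the baseline, in percent.
relative_gain = (model_accuracy / baseline - 1) * 100

print(f"baseline accuracy: {baseline:.2%}")
print(f"absolute gain:     {gain_points:.2f} percentage points")
print(f"relative gain:     {relative_gain:.1f}%")
```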

[Image: results table (see the Results PNG file)]

Over all 1000 runs, Logistic Regression had the best results, with the highest accuracy of the four models and one of the lowest standard deviations. SVM performed better than the remaining two models (KNN and Random Forest) but had the highest standard deviation, which may indicate that its results vary more from run to run.
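The kind of repeated evaluation described above could be sketched as follows with scikit-learn. This is an assumption-heavy illustration, not the project's code: the README does not say how each run was set up, so the sketch assumes a fresh random train/test split per run, uses synthetic three-class data from `make_classification` instead of the Brasileirao database, keeps default hyperparameters, and runs only 10 repetitions to stay fast.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Synthetic stand-in for the match data: 3 classes (victory, draw, loss).
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)


def make_models(seed: int) -> dict:
    """Fresh, untrained instances of the four model families named in the README."""
    return {
        "Logistic Regression": LogisticRegression(max_iter=1000),
        "SVM": SVC(),
        "KNN": KNeighborsClassifier(),
        "Random Forest": RandomForestClassifier(n_estimators=50, random_state=seed),
    }


N_REPEATS = 10  # the README reports 1000 runs; reduced here for speed
scores = {name: [] for name in make_models(0)}

for seed in range(N_REPEATS):
    # Assumed protocol: a new stratified random split on every repetition.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    for name, model in make_models(seed).items():
        model.fit(X_train, y_train)
        scores[name].append(model.score(X_test, y_test))  # accuracy on the test split

# Mean accuracy and standard deviation per model, best mean first.
summary = {name: (float(np.mean(v)), float(np.std(v))) for name, v in scores.items()}
for name, (mean, std) in sorted(summary.items(), key=lambda kv: -kv[1][0]):
    print(f"{name:20s} mean={mean:.3f} std={std:.3f}")
```

Reporting the standard deviation next to the mean is what makes it possible to say that one model is both more accurate and more stable than another.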

Conclusions

This project suggests that football prediction is still a very hard task, at least with only these variables, and that more variables are needed to improve the predictions. However, this article shows that machine learning algorithms can already "think" about which team to bet on, and can be more accurate than people who know nothing about the games, with an advantage of almost 17 percentage points over the probability of a random prediction.

For the future, I suggest investigating more variables that could be useful, such as injuries or more details about the players of each team; data from the FIFA or Pro Evolution Soccer games might help bring more information into the database. Another possible extension is predicting the number of goals scored by each team. This is more complex because the goal counts depend on the predicted result and must be consistent with it: for example, the prediction cannot be two goals for the home team and two goals for the away team if the result was predicted as a victory for the home team. So, maybe, this article can be a source of inspiration for the creation of better and more complex models in the future.
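The consistency requirement between a predicted result and a predicted scoreline can be expressed as a small check. The sketch below is only an illustration of that constraint: the function names are hypothetical, and it assumes the result labels are given from the home team's point of view.

```python
def outcome_from_score(home_goals: int, away_goals: int) -> str:
    """Result implied by a scoreline, from the home team's point of view (assumed convention)."""
    if home_goals > away_goals:
        return "victory"
    if home_goals < away_goals:
        return "loss"
    return "draw"


def is_consistent(predicted_outcome: str, home_goals: int, away_goals: int) -> bool:
    """True when the predicted scoreline agrees with the predicted result."""
    return outcome_from_score(home_goals, away_goals) == predicted_outcome


# The example from the text: a 2-2 scoreline contradicts a predicted home victory.
print(is_consistent("victory", 2, 2))  # False
print(is_consistent("victory", 2, 1))  # True
print(is_consistent("draw", 2, 2))     # True
```

A goal model could use such a check to reject or re-sample scorelines that contradict the result classifier.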

Files:

In this section all files will be described.

  • WebScrapping: Contains all the functions used to scrape the FBref pages.
  • Tratament: Contains all the treatment of the database used.
  • UDACITY_CAPSTONE_POST: PDF post of the Udacity Capstone project.
  • Brasileirao: Database from the Brazilian football championship.
  • Results: PNG file with the results.

References and Sources of Data:

  • https://fbref.com/ : Site where the web scraping was done. Data from Brazilian Serie A football.
