FPL Prediction

This is a project to generate ongoing player forecasts for Fantasy Premier League. Credit to https://github.com/vaastav/Fantasy-Premier-League, from which I have taken the script to scrape data from the FPL site each week.

Data and utility functions are contained in the fpl-predictor module. The following data can be found in the data directory:

  • Player data for each gameweek since the start of the 2016/17 season (one folder per season, each containing various CSV files, including fixtures)
  • teams.csv - Team names and global IDs, plus specific IDs for each season
  • train_v8.csv - The current training dataset containing all historic data
  • remaining_season.csv - A dataset with rows for each player's remaining fixtures in the current season, for use in predicting the remainder of the current season each week

I have written notebooks that walk through the entire process used to train, validate and select the forecast model. I am in the process of transferring these to Google Colab (links given). The online notebooks are the most up to date in terms of model training and should run without any issues (tell me on Twitter if not), but they are not yet as well commented as the original versions:

  • 00_fpl_features - Explore the training dataset (fields, data types, null values, etc.), write functions to generate window/lagging features (e.g. points per game for each player over the last 5 fixtures), and understand the approach to assessing the performance of models (validation)
  • 01_fpl_predict_baseline - Build a simple model to predict player points for use as a baseline, and write a function to transform the training data into a format that we can easily use to perform validation
  • 02_fpl_predict_random_forest - Build a random forest model and validate its performance
  • 03_fpl_predict_xgboost - Build an XGBoost model, including parameter search, and validate its performance
  • 04_fpl_predict_fastai2_tabular - Build a neural network model with embeddings for categorical features and validate its performance
  • 05_fpl_predict_lstm - Build a sequence model with LSTMs in TensorFlow and validate its performance
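As a rough illustration of the window/lagging features mentioned for 00_fpl_features (e.g. points per game over a player's last 5 fixtures), here is a minimal pandas sketch. It is not the code from the notebook or from util.py: the function name `add_rolling_ppg` and the column names `player`, `gw` and `points` are hypothetical, and the real training dataset may be laid out differently.

```python
import pandas as pd


def add_rolling_ppg(df, window=5):
    """Add each player's mean points over their previous `window` fixtures.

    Column names ("player", "gw", "points") are assumptions for this sketch.
    """
    df = df.sort_values(["player", "gw"]).copy()
    # shift(1) drops the current fixture, so the feature only uses past data
    # and does not leak the value we are trying to predict.
    df[f"ppg_last_{window}"] = df.groupby("player")["points"].transform(
        lambda s: s.shift(1).rolling(window, min_periods=1).mean()
    )
    return df


# Tiny made-up example: two players, a handful of gameweeks each.
example = pd.DataFrame(
    {
        "player": ["A"] * 4 + ["B"] * 3,
        "gw": [1, 2, 3, 4, 1, 2, 3],
        "points": [2, 6, 10, 1, 5, 0, 3],
    }
)
features = add_rolling_ppg(example, window=2)
print(features)
```

Each player's first fixture has no history, so the feature is missing (NaN) there; how those gaps should be filled is a modelling choice this sketch leaves open.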

These models have been validated by looking at their performance each gameweek of the 2020/21 season. For each gameweek we fit the model using all historical data prior to that week, and then calculate the mean absolute error for the following 6 gameweeks. The performance of each model across the season is summarised in the following chart:

[Image: comparison chart]

The LSTM model is the top performer currently, so this is the approach used to generate forecasts prior to each gameweek.
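The validation procedure described above (fit on everything before a gameweek, then score the following 6 gameweeks by mean absolute error) can be sketched roughly as follows. This is an illustrative outline rather than the code used in the notebooks: the function names, the `player`, `gw` and `points` columns, the single running gameweek index, and the choice to count the target gameweek itself as the first of the 6 are all assumptions. The baseline shown (each player's historical mean) is a stand-in for whichever model is being validated.

```python
import numpy as np
import pandas as pd


def walk_forward_mae(df, fit_predict, start_gw, end_gw, horizon=6):
    """Return {gameweek: MAE over the `horizon` gameweeks starting at it}.

    `fit_predict(train, valid)` is a hypothetical callable that fits on
    `train` and returns one prediction per row of `valid`.
    """
    scores = {}
    for gw in range(start_gw, end_gw + 1):
        train = df[df["gw"] < gw]  # all history before this gameweek
        valid = df[(df["gw"] >= gw) & (df["gw"] < gw + horizon)]
        if train.empty or valid.empty:
            continue
        preds = fit_predict(train, valid)
        errors = np.abs(valid["points"].to_numpy() - preds)
        scores[gw] = float(errors.mean())
    return scores


def mean_points_baseline(train, valid):
    """Stand-in model: predict each player's mean points in `train`."""
    player_means = train.groupby("player")["points"].mean()
    overall_mean = train["points"].mean()  # fallback for unseen players
    return valid["player"].map(player_means).fillna(overall_mean).to_numpy()


# Tiny made-up dataset: two players over four gameweeks.
data = pd.DataFrame(
    {
        "player": ["A", "B"] * 4,
        "gw": [1, 1, 2, 2, 3, 3, 4, 4],
        "points": [2, 1, 4, 1, 6, 1, 8, 1],
    }
)
scores = walk_forward_mae(data, mean_points_baseline, start_gw=2, end_gw=3, horizon=2)
print(scores)
```

Refitting at every gameweek keeps each score strictly out of sample, which is what makes the per-gameweek errors comparable across models.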

There are a further three Jupyter notebooks:

  • initial_fpl_data_clean.ipynb - The original process to take the raw data and create training and prediction datasets
  • update_data_weekly.ipynb - The notebook previously run each week with the XGBoost model to take the raw data and create updated training and prediction datasets
  • fpl_predict_fastai_tabular.ipynb - The notebook run each week to train a model using all historical data and predict the remainder of the current season

And one supporting Python script:

  • util.py - various functions used throughout, all of which are described in one of the above process notebooks.

Local setup:

I'd recommend using Colab: it's free and you don't need to worry much about setup. If you want to run this locally or on your own cloud machine, then for the non-neural-net notebooks I downloaded and installed Anaconda and set up an environment with jupyter, xgboost, pandas, matplotlib, requests, lxml and dtreeviz as follows:

conda create -n fplenv python=3.7
conda activate fplenv
conda install jupyter py-xgboost pandas matplotlib requests lxml
pip install dtreeviz
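Once the environment is active, a quick way to confirm that the packages listed above are visible to Python is a small check like the one below. This is an optional sketch rather than part of the project; the helper name `missing_packages` is made up, and the import names are assumed to match the package names given above (py-xgboost is imported as xgboost).

```python
from importlib.util import find_spec

# Import names for the packages installed in the conda environment above.
REQUIRED = ["jupyter", "xgboost", "pandas", "matplotlib", "requests", "lxml", "dtreeviz"]


def missing_packages(names=REQUIRED):
    """Return the names that cannot be found in the current environment."""
    # find_spec locates a top-level module without actually importing it.
    return [name for name in names if find_spec(name) is None]


missing = missing_packages()
print("Missing packages:", missing or "none")
```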

For neural nets (04_fpl_predict_fastai2_tabular.ipynb) I use fastai/PyTorch (installation instructions at https://docs.fast.ai/#Installing) and for sequence models I use TensorFlow 2, but I recommend using a cloud instance with a GPU (e.g. AWS, GCP or Paperspace, which has a fastai container).

Contributors:

solpaul
