Coder Social home page Coder Social logo

jkrusina / soccerpredictor Goto Github PK

View Code? Open in Web Editor NEW
85.0 11.0 24.0 2.41 MB

Predicting soccer matches outcomes with machine learning as time series.

CSS 6.01% Python 93.99%
soccer tensorflow keras machine-learning time-series forecasting betting python dash football

soccerpredictor's Introduction

SoccerPredictor

SoccerPredictor uses machine learning to predict outcomes of Premier League matches focusing on predicting win-or-draw or loss (corresponding to betting on double chance). The predictions are modeled as time series classification in an unconventional way. A neural network model is created for each team and trained simultaneously.

NOTE: Please, keep in mind that you will not be able to run the actual training, since I did not publish the dataset. If you would like to test it out, you would have to put together your own. Sites offering juicier features are trickier to scrape, but it is definitely possible.

So, take it mostly as an inspiration if you would like to build something similar or just to take a look how I implemented various things.

Without the dataset, only a visualization and backtesting on the attached set of files can be run. This serves mainly for demonstration purposes.

Please, refer to accompanying blog post to get more information about how the program works.

Results

Best result achieved was profit 1069 % with prediction accuracy around 90 % and ROI 33.4 % over the tested period. Timespan of the tested period was 113 days with betting on 32 out of 150 matches totally played.

Result

Installation

Running the program requires manual installation, e.g.:

$ git clone https://github.com/jkrusina/SoccerPredictor

# Create virtual environment and install required libraries
$ python3 -m venv SoccerPredictor
$ cd SoccerPredictor && source bin/activate
$ pip3 install -r requirements.txt

# To deactivate virtual environment
$ deactivate

Usage

The program is intended to be run in three different modes - training, visualization, and backtesting:

$ python3 main.py --help
usage: main.py [-h] {train,vis,backtest} ...

SoccerPredictor:

optional arguments:
  -h, --help            show this help message and exit

Modes to run:
  {train,vis,backtest}
    train               Trains model and makes predictions.
    vis                 Runs visualization of predictions.
    backtest            Runs backtesting on multiple folders.

Training

NOTE: You will not be to run training without the dataset as mentioned above.

$ python3 main.py train

Possible arguments to use:

$ python3 main.py train --help
usage: main.py train [-h] [--resume] [--epochs EPOCHS] [--ntest NTEST]
                     [--ndiscard NDISCARD] [--timesteps TIMESTEPS] [--predict]
                     [--lrpatience LRPATIENCE] [--lrdecay LRDECAY]
                     [--seed SEED] [--savefreq SAVEFREQ]
                     [--printfreq PRINTFREQ] [--verbose {0,1}] [--name NAME]

optional arguments:
  -h, --help            show this help message and exit
  --resume              Resumes training of previously saved model. Tries to
                        load the latest model saved if no name or prefix
                        specified via --name. (default: False)
  --epochs EPOCHS       Number of epochs to train model for. (default: 1)
  --ntest NTEST         Number of last samples used for testing for each team.
                        (default: 10)
  --ndiscard NDISCARD   Number of last samples to discard for each team.
                        (default: 0)
  --timesteps TIMESTEPS
                        Number of timesteps to use as data window size for
                        input to network. (default: 40)
  --predict             Whether to rerun predictions without any training.
                        (default: False)
  --lrpatience LRPATIENCE
                        How many epochs to tolerate before decaying learning
                        rate if no improvement. Turned off if 0. (default: 20)
  --lrdecay LRDECAY     How much to decay learning rate after patience
                        exceeded. (default: 0.95)
  --seed SEED           Specifies seed for rng. (default: None)
  --savefreq SAVEFREQ   How often (number of epochs) to save models. No
                        intermediate saving if 0. (default: 50)
  --printfreq PRINTFREQ
                        How often (number of epochs) to print current
                        summaries. No intermediate printing if 0. (default:
                        10)
  --verbose {0,1}       Level of verbosity. (default: 1)
  --name NAME           Tries to load the latest saved model with given name
                        prefix. Loads exact model if exact dir name specified.
                        Loads latest model if only a prefix of the name
                        specified. (default: None)

Examples:

# Training with fixed seed for 100 epochs
$ python3 main.py train --epochs 100 --seed 42

# Resuming training from latest directory
$ python3 main.py train --resume --epochs 100

# Training with discarding different amount of last samples, useful for backtesting
$ python3 main.py train --epochs 100 --ndiscard 1
$ python3 main.py train --epochs 100 --ndiscard 2
$ python3 main.py train --epochs 100 --ndiscard 3
$ python3 main.py train --epochs 100 --ndiscard 4

# Recreate predictions without training
$ python3 main.py train --predict

Visualization

A simple visualization of the trained model built with Dash.

$ python3 main.py vis

Then it is accessible via browser (by default):

http://127.0.0.1:8050/

Possible arguments to use:

$ python3 main.py vis --help
usage: main.py vis [-h] [--port PORT] [--host HOST] [--name NAME]
                   [--ignoreodds IGNOREODDS]

optional arguments:
  -h, --help            show this help message and exit
  --port PORT           Custom port for Dash visualization. (default: 8050)
  --host HOST           Custom host for Dash visualization. Can use 0 for
                        0.0.0.0 shortcut. (default: 127.0.0.1)
  --name NAME           Tries to load the latest saved model with given name
                        prefix. Loads exact model if exact dir name specified.
                        Loads latest model if only a prefix of the name
                        specified. (default: None)
  --ignoreodds IGNOREODDS
                        Ignores odds less than given amount when predicting
                        which team to bet on. (default: 1.1)

Examples:

# Exact name of directory can be specified
$ python3 main.py vis --name KIIY_2019-11-26T02-13-11_400

# Loads latest directory with given prefix
$ python3 main.py vis --name KIIY
# Change value of odds to ignore
$ python3 main.py vis --ignoreodds 1.01

Backtesting

Backtesting on trained models:

$ python3 main.py backtest

Possible arguments to use:

$ python3 main.py backtest --help
usage: main.py backtest [-h] [--path PATH] [--ignoreodds IGNOREODDS]

optional arguments:
  -h, --help            show this help message and exit
  --path PATH           Path to folder where multiple trained models are
                        saved. (default: ./data/models/)
  --ignoreodds IGNOREODDS
                        Ignores odds less than given amount when predicting
                        which team to bet on. (default: 1.1)

Examples:

# Change value of odds to ignore
$ python3 main.py backtest --ignoreodds 1.01

Requirements

Main requirements are:

  • Python 3.7
  • Keras 2.3.1
  • Tensorflow 1.14.0
  • Dash 1.4.1

Other required packages are specified in the requirements.txt file.

soccerpredictor's People

Contributors

jkrusina avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

soccerpredictor's Issues

Could you ate least provide a test data sample

Hi,

I'm very interested in testing your script, but I have no starting point. It would be very helpful if you provide a test data sample or at least specify how is the table format, columns, or the table contents.

Best,
Ricardo Faria

Dataset

Awesome stuff you got build up here. It truly is an inspiration. But about the dataset. Can you at least provide an example of how the dataset should look like?

ValueError: Value must be a nonnegative integer or None

Traceback (most recent call last):
  File "C:\Users\usr\Downloads\SoccerPredictor-master\main.py", line 21, in <module>
    pd.set_option("display.max_colwidth", -1)
  File "C:\Users\usr\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\_config\config.py", line 261, in __call__
    return self.__func__(*args, **kwds)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\usr\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\_config\config.py", line 160, in _set_option
    o.validator(v)
  File "C:\Users\usr\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\_config\config.py", line 882, in is_nonnegative_int
    raise ValueError(msg)
ValueError: Value must be a nonnegative integer or None

C:\Users\usr\Downloads\SoccerPredictor-master>python main.py -h
Traceback (most recent call last):
  File "C:\Users\usr\Downloads\SoccerPredictor-master\main.py", line 21, in <module>
    pd.set_option("display.max_colwidth", -1)
  File "C:\Users\usr\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\_config\config.py", line 261, in __call__
    return self.__func__(*args, **kwds)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\usr\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\_config\config.py", line 160, in _set_option
    o.validator(v)
  File "C:\Users\usr\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\_config\config.py", line 882, in is_nonnegative_int
    raise ValueError(msg)
ValueError: Value must be a nonnegative integer or None

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.