
🐒 AI for Super Auto Pets

License: MIT License

artificial-intelligence reinforcement-learning super-auto-pets game-development ai sb3 sapai sapai-gym machine-vision computer-vision

super-ml-pets's Introduction


super-ml-pets

Framework for training and deploying AIs for Super Auto Pets


Train AIs for Super Auto Pets through a simulated environment and test the trained model against real opponents in the actual game! The AI is trained using reinforcement learning, and a machine vision system captures the screen to feed information to the AI.

The framework supports Python 3.7-3.11 and works cross-platform (Ubuntu, Windows, macOS). Deployment is also compatible with the web app.

Training has also been tested with GitHub Codespaces and Google Colab. A demonstration of model training can be seen in this gist.

We recommend using Windows for deployment, as UNIX-based systems require root permissions to launch the program out-of-the-box.

  1. Clone the repo:
git clone https://github.com/andreped/super-ml-pets.git
  2. Set up the virtual environment:
cd super-ml-pets/
virtualenv -ppython3 venv --clear
./venv/Scripts/activate

To activate the virtual environment on UNIX-based systems, run source venv/bin/activate instead of the last line.

  3. Install requirements:
pip install -r requirements.txt
  4. Download all pets, food, and misc icons (the commands below use PowerShell cmdlets; on UNIX-based systems, extract the archives with your unzip tool of choice):
wget https://github.com/andreped/super-ml-pets/releases/download/pets-01-2024/pets.zip -O pets.zip; Expand-Archive pets.zip -DestinationPath ./; Remove-Item pets.zip
wget https://github.com/andreped/super-ml-pets/releases/download/food-01-2024/food.zip -O food.zip; Expand-Archive food.zip -DestinationPath ./; Remove-Item food.zip
wget https://github.com/andreped/super-ml-pets/releases/download/misc-01-2024/misc.zip -O misc.zip; Expand-Archive misc.zip -DestinationPath ./; Remove-Item misc.zip
Additional setup for Ubuntu only
sudo apt install python3-tk
sudo su
source venv/bin/activate
xhost +
export DISPLAY=:0.0

Note that the command sudo su enables administrator rights. This seems to be required by the keyboard library, as mentioned in issue #23. The xhost and DISPLAY commands are needed because the screen might otherwise not be found; initializing one solves the issue.

This framework currently supports training and deploying RL models for SAP.

Training

For training in the simulated environment, using default arguments, simply run:

python main.py --task train

Given an existing model, it is also possible to fine-tune it, for example:

python main.py --task train --finetune /path/to/model_sap_gym_sb3_180822_checkpoint_2175_steps

The script supports other arguments. To see what is possible, run:

python main.py --help
Testing
  1. To use a trained model in battle, start the game Super Auto Pets.

  2. Ensure that the game is in full screen mode, disable all unnecessary prompts, enable the auto name picker, and set speed to 200% (you might also have to enable auto battle, which can only be done in the first battle, if this is the first time you are playing the game).

  3. Change the UI style to classic for all options in Customize, including "Food art", "Background art", "Menu background", "Buff style", and "Held food".

  4. Change UI style for pets to classic by going to the pets settings.

  5. Enter the arena by clicking Arena mode.

  6. Go outside the game and download a pretrained model from here, or use any pretrained model you might have. For simplicity, you can also run the following to download an example model:

wget https://github.com/andreped/super-ml-pets/releases/download/v0.0.6/model_sap_gym_sb3_280822_finetuned_641057_steps.zip
  7. Then, simply start the AI by running this command from the terminal (with an example path to a pretrained model, without the .zip extension):
python main.py --task deploy --infer_model /path/to/model_sap_gym_sb3_280822_finetuned_641057_steps
  8. Go back into the game and press the Space key (when you are in the Arena, in team preparation, before battle).

It might take a few seconds, but you should now see the AI start playing. Please let it play in peace, or else it might get angry and you may have accidentally created Skynet. If you accidentally exit the game, or don't have the game in full screen, the machine vision system will fail, and you will have to start a completely new game to use the AI (properly).

Training history

To plot training history, run:

python smp/plot_history.py --log /path/to/history/history_rl_model/progress.csv


Troubleshoot

To install virtualenv, run:

pip install virtualenv

If virtualenv is not on your PATH, you can invoke it with:

python -m virtualenv -ppython3 venv --clear

To activate virtual environment on UNIX-based systems (e.g., macOS or Ubuntu), run:

source venv/bin/activate

If you are using newer versions of Python (e.g., 3.10), you might have issues installing and/or using numpy alongside the other dependencies. If that happens, try downgrading numpy:

pip install numpy==1.23.2 --force-reinstall

On both Ubuntu and macOS, running deployment might require sudo permissions. This is because keyboard events cannot be recognized without sudo rights. On Windows, administrative rights are not needed. For more information, see here.

On macOS, when you download the models (.zip files) from Releases, they might be unzipped automatically. This is a problem, as the model must keep its .zip extension. To fix this, disable "Open safe files after downloading" in the Safari Preferences (see here for more information).

If deployment fails to start (no mouse movements or events), it may be because your screen resolution differs from the expected one. The current machine vision system expects a screen resolution of 1920x1080, so please adjust your resolution accordingly. This will be fixed in the future.

This implementation is based on multiple projects. The core implementation is derived from GoldExplosion, which in turn was built upon the Super Auto Pets engine sapai and RL training through sapai-gym.

All credit to jpdefrutos for designing the amazing header figure.

If you found this project relevant for your research, please cite the following:

@software{andre_pedersen_2023_7834142,
  author       = {André Pedersen and Javier Pérez de Frutos and laughinggaschambers and GoldExplosion},
  title        = {andreped/super-ml-pets: v0.0.9},
  month        = apr,
  year         = 2023,
  publisher    = {Zenodo},
  version      = {v0.0.9},
  doi          = {10.5281/zenodo.7834142},
  url          = {https://doi.org/10.5281/zenodo.7834142}
}

super-ml-pets's People

Contributors

andreped, goldexplosion, hjk-x, jpdefrutos, laughinggaschambers


super-ml-pets's Issues

Question

What is the recommended setting for training, and how long have you been training the v6 one?

RuntimeWarning

I keep getting this runtime warning. Not sure if this is specific to me or everyone is getting this.

C:\Users\jivit\anaconda3\envs\super\lib\site-packages\skimage\metrics\_structural_similarity.py:230: RuntimeWarning: invalid value encountered in divide
S = (A1 * A2) / D

The project still works after this warning.

macOS deployment requires administrator rights

Quite annoyingly, the keyboard library requires sudo rights in order to be used on macOS.

If this is not done, you will get this error message.

The alternative is to run the script like so: sudo python deploy.py

Even then, with sudo, the program might fail to find the dependencies (observed on macOS).

Nonetheless, this is bad practice. We should not need sudo permissions to deploy the software. We need to find a way to enable keyboard events on macOS (all UNIX-based systems likely have the same issue, e.g., Ubuntu).

ERROR: Failed building wheel for gym: wheel.vendored.packaging._tokenizer.ParserSyntaxError: Expected end or semicolon (after version specifier)

While installing gym==0.21.0, this error is seen.
It seems to have been broken for the past 2-3 weeks.

14:12:26 Traceback (most recent call last):
14:12:26 File "/usr/local/lib/python3.8/site-packages/wheel/vendored/packaging/requirements.py", line 35, in init
14:12:26 parsed = parse_requirement(requirement_string)
14:12:26 File "/usr/local/lib/python3.8/site-packages/wheel/vendored/packaging/_parser.py", line 64, in parse_requirement
14:12:26 return _parse_requirement(Tokenizer(source, rules=DEFAULT_RULES))
14:12:26 File "/usr/local/lib/python3.8/site-packages/wheel/vendored/packaging/_parser.py", line 82, in _parse_requirement
14:12:26 url, specifier, marker = _parse_requirement_details(tokenizer)
14:12:26 File "/usr/local/lib/python3.8/site-packages/wheel/vendored/packaging/_parser.py", line 126, in _parse_requirement_details
14:12:26 marker = _parse_requirement_marker(
14:12:26 File "/usr/local/lib/python3.8/site-packages/wheel/vendored/packaging/_parser.py", line 147, in _parse_requirement_marker
14:12:26 tokenizer.raise_syntax_error(
14:12:26 File "/usr/local/lib/python3.8/site-packages/wheel/vendored/packaging/_tokenizer.py", line 163, in raise_syntax_error
14:12:26 raise ParserSyntaxError(
14:12:26 wheel.vendored.packaging._tokenizer.ParserSyntaxError: Expected end or semicolon (after version specifier)
14:12:26 opencv-python>=3.
14:12:26 ~~~^
14:12:26
14:12:26 The above exception was the direct cause of the following exception:
14:12:26
14:12:26 Traceback (most recent call last):
14:12:26 File "", line 2, in
14:12:26 File "", line 34, in
14:12:26 File "/tmp/pip-wheel-34ur4kuy/gym_f74ca167b0dc4f1da53d4decc6d2d4f6/setup.py", line 39, in
14:12:26 setup(
14:12:26 File "/usr/local/lib/python3.8/site-packages/setuptools/init.py", line 153, in setup
14:12:26 return distutils.core.setup(**attrs)
14:12:26 File "/usr/local/lib/python3.8/distutils/core.py", line 148, in setup
14:12:26 dist.run_commands()
14:12:26 File "/usr/local/lib/python3.8/distutils/dist.py", line 966, in run_commands
14:12:26 self.run_command(cmd)
14:12:26 File "/usr/local/lib/python3.8/distutils/dist.py", line 985, in run_command
14:12:26 cmd_obj.run()
14:12:26 File "/usr/local/lib/python3.8/site-packages/wheel/bdist_wheel.py", line 395, in run
14:12:26 self.egg2dist(self.egginfo_dir, distinfo_dir)
14:12:26 File "/usr/local/lib/python3.8/site-packages/wheel/bdist_wheel.py", line 534, in egg2dist
14:12:26 pkg_info = pkginfo_to_metadata(egginfo_path, pkginfo_path)
14:12:26 File "/usr/local/lib/python3.8/site-packages/wheel/metadata.py", line 160, in pkginfo_to_metadata
14:12:26 for key, value in generate_requirements({extra: reqs}):
14:12:26 File "/usr/local/lib/python3.8/site-packages/wheel/metadata.py", line 138, in generate_requirements
14:12:26 for new_req in convert_requirements(depends):
14:12:26 File "/usr/local/lib/python3.8/site-packages/wheel/metadata.py", line 103, in convert_requirements
14:12:26 parsed_requirement = Requirement(req)
14:12:26 File "/usr/local/lib/python3.8/site-packages/wheel/vendored/packaging/requirements.py", line 37, in init
14:12:26 raise InvalidRequirement(str(e)) from e
14:12:26 wheel.vendored.packaging.requirements.InvalidRequirement: Expected end or semicolon (after version specifier)
14:12:26 opencv-python>=3.
14:12:26 ~~~^
14:12:26 [end of output]
14:12:26
14:12:26 note: This error originates from a subprocess, and is likely not a problem with pip.
14:12:26 ERROR: Failed building wheel for gym
14:12:26 Running setup.py clean for gym
building_wheel_gym_error.log

Additional information on requirements needed

I keep running into this error when trying to install the framework.

Namespace(finetune=None, model_name='rl_model', nb_games=10000, nb_retries=1, nb_steps=10000, task='deploy')

Pausing...

Running...
Traceback (most recent call last):
  File "main.py", line 36, in <module>
    run(path)
  File "C:\Users\jivit\Documents\Python_Scripts\RL\super-ml-pets\src\game_interaction\agent.py", line 69, in run
    model = MaskablePPO.load(model_path)
  File "C:\Users\jivit\anaconda3\envs\gpu2\lib\site-packages\stable_baselines3\common\base_class.py", line 675, in load
    path, device=device, custom_objects=custom_objects, print_system_info=print_system_info
  File "C:\Users\jivit\anaconda3\envs\gpu2\lib\site-packages\stable_baselines3\common\save_util.py", line 419, in load_from_zip_file
    data = json_to_data(json_data, custom_objects=custom_objects)
  File "C:\Users\jivit\anaconda3\envs\gpu2\lib\site-packages\stable_baselines3\common\save_util.py", line 164, in json_to_data
    deserialized_object = cloudpickle.loads(base64_object)
AttributeError: Can't get attribute '_make_function' on <module 'cloudpickle.cloudpickle' from 'C:\Users\jivit\anaconda3\envs\gpu2\lib\site-packages\cloudpickle\cloudpickle.py'>

I think it is because I am using a different version of stable baselines 3. Can you include the version of sb3_contrib and python you are using?

Requirements need refactoring

Currently, requirements.txt contains too much noise and overly specific versions, some of which also appear to be Windows-only. We should relax the version strictness.

Fixing this will enable the framework to work with all operating systems and Python versions again.

Unhashable type numpy error

Hi!
I'm trying to run the script, but I'm getting the numpy error "TypeError: unhashable type: 'numpy.ndarray'". This is on the latest version, tried on Windows and Ubuntu, same issue. Python 3.10 and 3.11. The script seems to see the screen, but fails to continue.
Any idea?
Thanks!

INFO main.py 56 Running...
INFO deploy_agent.py 94 INITIALIZATION [self.run]: Loading Model
INFO deploy_agent.py 97 INITIALIZATION [self.run]: Create SuperAutoPetsEnv Object
INFO deploy_agent.py 104 CV SYSTEM [self.run]: Detect the Pets and Food in the Shop Slots
INFO deploy_agent.py 106 CV SYSTEM [self.run]: Calls [image_detection.find_the_animals]
INFO deploy_agent.py 111 CV SYSTEM [self.run]: The detected Pets and Food in the Shop is : ['fish', 'horse', 'pig', 'honey']
INFO deploy_agent.py 112 GAME ENGINE [self.run]: Set Environment Shop = detected Pets and Food
INFO deploy_agent.py 121 GAME ENGINE [self.run]: Get the best action to make for the given state from the loaded model
Traceback (most recent call last):
  File "/home/pablo/super-ml-pets/main.py", line 63, in <module>
    main()
  File "/home/pablo/super-ml-pets/main.py", line 57, in main
    run(ret)
  File "/home/pablo/super-ml-pets/src/deploy_agent.py", line 128, in run
    " Current Team and Shop \n{}".format(s[action][0]))
TypeError: unhashable type: 'numpy.ndarray'

Add markdown table for supported setups

It would be possible to have a markdown table that shows which Python versions and operating systems are supported (and which combinations are compatible). Furthermore, one could synchronize this table directly with the CI tests. It would be very convenient for the user to see what currently works. A sketch is given below.
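
A minimal sketch of what such a table could look like (the Python range and operating systems come from the README; the cells are placeholders to be filled in from, and kept in sync with, the CI results):

| OS / Python | 3.7 | 3.8 | 3.9 | 3.10 | 3.11 |
|-------------|-----|-----|-----|------|------|
| Ubuntu      |  ?  |  ?  |  ?  |  ?   |  ?   |
| Windows     |  ?  |  ?  |  ?  |  ?   |  ?   |
| macOS       |  ?  |  ?  |  ?  |  ?   |  ?   |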

Add unit test for history plotter

We have a unit test to assess whether a simple training passes.

When it is done, it would be great to check whether we can read from the generated log file and plot the result.

We do not necessarily care what it plots, as long as the entire script runs without error. A sketch of such a test is given below.
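
A minimal sketch of such a test, assuming a prior training test has already produced ./history/history_rl_model/progress.csv and that the plotter can run headlessly with a non-interactive matplotlib backend:

import os
import subprocess
import sys

def test_plot_history():
    # run the plotter as a script on the generated log; we only assert
    # that it exits cleanly, not what it actually draws
    env = {**os.environ, "MPLBACKEND": "Agg"}  # headless matplotlib backend
    result = subprocess.run(
        [sys.executable, "smp/plot_history.py",
         "--log", "history/history_rl_model/progress.csv"],
        env=env, capture_output=True, text=True,
    )
    assert result.returncode == 0, result.stderr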

Chrome extension to deploy AI in web app?

A cool idea I just had would be to deploy a trained AI in the web app through a browser extension, e.g., for Chrome.

Initially, I assumed that one would need to write lots of JavaScript to do that, or at least set up some sort of microservice, but then I remembered that PyScript is a thing and might be usable to resolve this issue.

Might be challenging with lots of dependencies, but I guess we could try.

New UI event causes bot to crash

This was observed after two games of playing, where the bot had lost the first game but won the second.

This happens right after the shop UI is shown following a game. Hence, the auto-clicker has stopped, and the bot thinks it is looking at the shop.

A fix could be to click one second longer after the shop UI has shown, to automatically skip this prompt.


Another training method

This is probably impossible, but it would be good if we had a way of training the AI by playing ourselves, or training from video.

Model history is not linked to model

Currently, when training, the training log is stored in a fixed place: ./history/sb3_log/.

With how the logging works, if all models are linked to the same file, it will either 1) overwrite it or 2) append to it. It would be better if the histories of different models were saved to independent files, as sketched below.
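
A minimal sketch of the idea, assuming stable-baselines3's logger API (configure/set_logger); the naming scheme is illustrative:

from datetime import datetime
from stable_baselines3.common.logger import configure

def attach_model_logger(model, model_name="rl_model"):
    # give each model its own timestamped history folder instead of
    # the shared ./history/sb3_log/
    log_dir = "./history/{}_{}/".format(
        model_name, datetime.now().strftime("%d%m%y_%H%M%S"))
    model.set_logger(configure(log_dir, ["csv"]))
    return log_dir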

animal pack is outdated

The engine runs on an older version of SAP where for instance a fish was a 3-3 and not a 2-2. We need to update sapai to have the updated numbers, such that the trained model transfers well to the real game.

CI is outdated

The CI script was made for the old solution.

The unit test should involve training in a simulated environment, using different operating systems and python versions.

Swap problem

The bot mainly swaps the positions of pets. It barely does anything else. You should add a limit to the number of times the bot is allowed to swap pet positions, so it actually does something other than swapping.

Most functions and classes lack documentation

We should add descriptions on what the different functions and classes do.

This should be done in a standardized manner. Probably something like the format that PyCharm suggests:

def test_function(p1, p2, p3):
    """
    test_function does blah blah blah.

    :param p1: description of parameter p1
    :param p2: description of parameter p2
    :param p3: description of parameter p3
    :return: description of what it returns
    """
    pass

The Sloth anomaly

If we take a look at the pets in the standard expansion, there is no Sloth. But recently, while playing the standard pack, the pet Sloth appeared (it didn't occur to me to take a screenshot (ㅠ﹏ㅠ)).
This anomaly may cause the computer vision system to crash. Proper exception handling is needed to prevent this.

Ubuntu 18.x deprecation

As Ubuntu 18.x is deprecated in GitHub Actions, or at least there will not be any runners using it, it is probably a good idea to bump to Ubuntu 20.x, or use a separate Docker image with Ubuntu 18.x on the fly in the CI.

Are sequential trainings properly random?

When doing some CI tests, I observed that for some setups training never stopped. It seemed to always reach the same successful state. Hence, when restarting the training, are we properly resetting the SAP environment from sapai-gym? Perhaps we need to set a seed? A sketch is given below.
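
A minimal sketch of explicit seeding between runs; whether SuperAutoPetsEnv actually honors a seed is exactly what would need verifying here:

import random
import numpy as np
import torch

def seed_everything(env, seed=42):
    # pin all relevant RNGs so sequential trainings start from the same state
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    if hasattr(env, "seed"):  # gym<=0.21 convention
        env.seed(seed)
    return env.reset()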

Use PEP8 standard

It is probably a good idea to stick to a single standard when making methods and whatnot.

PEP8 is probably the way to go.

CIs are extremely slow for Python 3.11

I believe this is because most dependencies do not have precompiled wheels for Python 3.11 and therefore must be built from source.

This will likely be solved over time as more deps support Python 3.11 natively, but for now we are stuck with 15 mins of dependency installation (compared to 3 mins normally).

An example CI can be seen here.

Reduce project size

Right now the entire repo is about 327 MB, which is way higher than the recommended maximum size of 100 MB.

This also means that it takes very long to clone.

Paths are windows-only

The current solution uses "\" to join paths, which only works on Windows. We should replace these with "/" or platform-agnostic helpers so it works cross-platform, as sketched below.
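
A minimal sketch of the proposed fix using pathlib (the concrete path is a hypothetical example):

from pathlib import Path

# instead of e.g. "history\\sb3_log\\progress.csv" (Windows-only),
# let pathlib pick the right separator for the current OS:
log_file = Path("history") / "sb3_log" / "progress.csv"
print(log_file)  # history/sb3_log/progress.csv on UNIX, backslashes on Windows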

Training seems to crash occasionally

When training RL models using sapai-gym, different errors tend to occur.

I have tried to use try-except blocks, but the problem is that when this happens, training with Stable Baselines 3 crashes, and we have to start all over again.

We should therefore either 1) fix what is bugged in sapai/sapai-gym, or 2) add a wrapper function that catches the failure and retries (if possible), as sketched below.
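
A minimal sketch of option 2, assuming the standard stable-baselines3 learn() API; the names and retry count are illustrative:

import logging

def train_with_retries(model, total_timesteps, max_retries=10):
    # catch the sporadic sapai/sapai-gym exceptions and resume training
    # (keeping the accumulated timestep counter) instead of starting over
    for attempt in range(max_retries):
        try:
            model.learn(total_timesteps=total_timesteps,
                        reset_num_timesteps=False)
            return model
        except Exception as exc:  # sapai raises plain Exceptions
            logging.warning("Training crashed (%s), retry %d/%d",
                            exc, attempt + 1, max_retries)
    raise RuntimeError("Training kept crashing after {} retries".format(max_retries))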

Dynamic support for different monitors

As different monitors may have different resolutions, this will affect the machine vision system, which captures objects on screen.

Currently, there are hard-coded parameters for the locations of objects on the screen, which break if, for instance, a 4K monitor is used.

We should detect the resolution the game is running at and scale the hard-coded values accordingly, as sketched below.
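
A minimal sketch of the proposed scaling; detecting the screen size via pyautogui is an assumption (any equivalent query would work):

import pyautogui

REF_W, REF_H = 1920, 1080  # resolution the current coordinates were tuned for

def scale_point(x, y):
    # map a coordinate tuned for 1920x1080 onto the actual screen resolution
    screen_w, screen_h = pyautogui.size()
    return int(x * screen_w / REF_W), int(y * screen_h / REF_H)

# e.g., for a hypothetical hard-coded shop-slot location:
# pyautogui.click(*scale_point(450, 620))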

Model not improving

After training for a while, that is, fine-tuning the current best model(s), we find that the performance plateaus at a maximum of around 20-25 win-ratio reward.

We should experiment with different values for hyperparameters such as batch size and learning rate.

Also, sapai-gym does not currently support freezing. If we add that, the model should likely improve, as freezing is an important aspect of the game, especially for scaling.

Below is a plot of the training history, which displays it as one continuous training run, but where some crashes (and restarts) have occurred (which could explain some of the spikes and sudden drops).

[figure: training history showing no improvement]

Maybe not an issue but

The bot plays extremely slowly (1000% slower than a snail). Also, can someone record the bot playing and send me the video? I don't know if it's my setup or something else. Resolution: 1920x1080. Fullscreen: yes.

Add method to stop AI

When using the AI with the game during deployment, there is no simple way to stop the AI other than CTRL+C from the terminal. However, as the AI uses mouse move events, this might be challenging to do, especially at certain stages.

A good option is to listen for a specific keyboard event, e.g., an ESCAPE key press, to stop the AI, as sketched below.
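
A minimal sketch using the keyboard library (already a dependency); the loop shown is illustrative of where the deploy code would check the flag:

import keyboard

stop_requested = False

def request_stop():
    global stop_requested
    stop_requested = True

keyboard.add_hotkey("esc", request_stop)

# inside the deployment loop (illustrative):
# while not stop_requested:
#     perform_next_action()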

Bug running deployment on macOS

Just tested the framework on macOS Monterey (12.3.1) using Python 3.7.9.
Training works just fine, but deployment crashes.
The first run can also be a bit slow, but that is likely due to macOS running some security checks.
Sudo is also required to run the software, likely due to the keyboard library being used. The same was observed on Ubuntu.

Error prompt after running deployment:

Traceback (most recent call last):
  File "main.py", line 49, in <module>
    run(ret)
  File "/Users/X/workspace/super-ml-pets/src/deploy_agent.py", line 98, in run
    model = MaskablePPO.load(ret.infer_model, custom_objects=custom_objects)
  File "/Users/X/workspace/super-ml-pets/venv/lib/python3.7/site-packages/stable_baselines3/common/base_class.py", line 705, in load
    path, device=device, custom_objects=custom_objects, print_system_info=print_system_info
  File "/Users/X/workspace/super-ml-pets/venv/lib/python3.7/site-packages/stable_baselines3/common/save_util.py", line 435, in load_from_zip_file
    th_object = th.load(file_content, map_location=device)
  File "/Users/X/workspace/super-ml-pets/venv/lib/python3.7/site-packages/torch/serialization.py", line 713, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/Users/X/workspace/super-ml-pets/venv/lib/python3.7/site-packages/torch/serialization.py", line 920, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, '\x00'.

Remove redundant scripts

Currently, the source code is packed with methods and full scripts that are not used.

We should just remove most of these, as they come from the old solution, which is quite different from the current framework.

We will also always have the commit history. Hence, we can find these old methods and scripts if necessary.

Support tensorboard logs

Currently, we are saving the training history to a .csv file using the CSVLogger tool. However, it would be great to also support saving the history as TensorBoard logs, as these are quite commonly used by developers.

If it works properly, we may even replace the CSV solution completely. A sketch is given below.
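
A minimal sketch, assuming stable-baselines3's logger API; the CSV and TensorBoard writers can be combined, so the CSV output could be kept or dropped later:

from stable_baselines3.common.logger import configure

def attach_loggers(model, log_dir="./history/sb3_log/"):
    # write both the existing CSV history and TensorBoard event files
    # (requires the tensorboard package to be installed)
    model.set_logger(configure(log_dir, ["csv", "tensorboard"]))

# inspect afterwards with: tensorboard --logdir ./history/sb3_log/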

Pip install fails in GitHub codespaces

Just attempted to set up the project using GitHub Codespaces, and during pip install -r requirements.txt, an error occurred:

Collecting sapai@ git+https://github.com/andreped/sapai.git@update-stats
  Cloning https://github.com/andreped/sapai.git (to revision update-stats) to /tmp/pip-install-eln35jls/sapai_2c0fd4b241034f06956f0a9abf03bed9
  Running command git clone --filter=blob:none --quiet https://github.com/andreped/sapai.git /tmp/pip-install-eln35jls/sapai_2c0fd4b241034f06956f0a9abf03bed9
  Running command git checkout -b update-stats --track origin/update-stats
  Switched to a new branch 'update-stats'
  branch 'update-stats' set up to track 'origin/update-stats'.
  Resolved https://github.com/andreped/sapai.git to commit bdc16215157a294194085be954d38c9a100a9b53
  Preparing metadata (setup.py) ... done
Collecting gym~=0.21.0
  Downloading gym-0.21.0.tar.gz (1.5 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.5/1.5 MB 290.3 MB/s eta 0:00:00
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error
  
  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [1 lines of output]
      error in gym setup command: 'extras_require' must be a dictionary whose values are strings or lists of strings containing valid project/version requirement specifiers.
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.

Strange behaviour when team is full

When deploying the AI against real people, I observed that it tends to sell an item (maybe always the first?). When this happens, it seems to be able to move the first item (which has already been sold) to another place, which of course is not possible. The AI then just starts doing strange things. Perhaps there is something wrong with the engine when animals are sold? Or does the machine vision system fail? Perhaps a longer pause is needed between each action?

Add unit tests for deployment

Right now we have added a test CI for verifying that training works as intended.

For deployment this is more challenging to do as we cannot directly play the game from a CI.

Hence, a suggestion could be to set up a simple virtual play session, where we have screenshots for the AI to interact with.

It might be that it crashes anyway, but at least we can verify that some of the deployment functionality works. A sketch of such a test is given below.
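
A minimal sketch of such a test; the import path and the signature of find_the_animals are assumptions, and the screenshot would be a committed 1920x1080 test asset:

from src.image_detection import find_the_animals  # hypothetical import path

def test_find_the_animals_on_screenshot():
    # run the machine vision step on a stored shop screenshot instead of
    # a live screen grab; the asset is chosen so the expected pets appear
    pets, food = find_the_animals("tests/assets/shop_screenshot.png")
    assert "fish" in pets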

latest sapai not compatible with sapai-gym

As discussed in this thread, sapai-gym and sapai are no longer compatible:
alexdriedger/sapai-gym#11

The simplest way to use sapai-gym is to use an older version of sapai. However, when installing naively, sapai is installed from the master branch. Hence, installation will fail.

I will have a separate branch in forks of both frameworks which holds the stable versions, as sketched below.
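
A sketch of how requirements.txt could pin the compatible pair; the sapai line matches the fork branch seen elsewhere on this page, while the sapai-gym fork URL and branch name are assumptions:

# requirements.txt (sketch): pin both engines to known-compatible branches
sapai @ git+https://github.com/andreped/sapai.git@update-stats
sapai-gym @ git+https://github.com/andreped/sapai-gym.git@<stable-branch>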

Flake8

Should use Flake8 for assessing code quality. It would also be nice to get a coverage badge in the README.

Keyboard events require sudo on Ubuntu

Tested deployment on Ubuntu.
Also needed to install tkinter as an additional requirement.

Ran it with:

sudo su
source venv/bin/activate
xhost +
export DISPLAY=:0.0

Traceback (most recent call last):
  File "main.py", line 37, in <module>
    pause()
  File "/home/xxx/super-ml-pets/src/agent.py", line 22, in pause
    if keyboard.read_key() == 'space':
  File "/home/xxx/super-ml-pets/venv/lib/python3.8/site-packages/keyboard/__init__.py", line 935, in read_key
    event = read_event(suppress)
  File "/home/xxx/super-ml-pets/venv/lib/python3.8/site-packages/keyboard/__init__.py", line 924, in read_event
    hooked = hook(queue.put, suppress=suppress)
  File "/home/xxx/super-ml-pets/venv/lib/python3.8/site-packages/keyboard/__init__.py", line 461, in hook
    append(callback)
  File "/home/xxx/super-ml-pets/venv/lib/python3.8/site-packages/keyboard/_generic.py", line 67, in add_handler
    self.start_if_necessary()
  File "/home/xxx/super-ml-pets/venv/lib/python3.8/site-packages/keyboard/_generic.py", line 35, in start_if_necessary
    self.init()
  File "/home/xxx/super-ml-pets/venv/lib/python3.8/site-packages/keyboard/__init__.py", line 196, in init
    _os_keyboard.init()
  File "/home/xxx/super-ml-pets/venv/lib/python3.8/site-packages/keyboard/_nixkeyboard.py", line 113, in init
    build_device()
  File "/home/xxx/super-ml-pets/venv/lib/python3.8/site-packages/keyboard/_nixkeyboard.py", line 109, in build_device
    ensure_root()
  File "/home/xxx/super-ml-pets/venv/lib/python3.8/site-packages/keyboard/_nixcommon.py", line 174, in ensure_root
    raise ImportError('You must be root to use this library on linux.')
ImportError: You must be root to use this library on linux.

Strange behaviour after continuing training after crash

After an Exception happens, for whatever reason, the ep_rew_mean and ep_len_mean are much higher than usual. Are we properly resetting the environment before restarting the training? Or is there a more important issue? Perhaps the opponents are reset to their poorest state, meaning that after restarting we are playing against easier opponents?

Note that after a while it goes back down to a similar level as before the crash.

This is the output I got around the Exception:

----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 72.6       |
|    ep_rew_mean          | 13.8       |
| time/                   |            |
|    fps                  | 105        |
|    iterations           | 300        |
|    time_elapsed         | 5833       |
|    total_timesteps      | 614400     |
| train/                  |            |
|    approx_kl            | 0.06014028 |
|    clip_fraction        | 0.185      |
|    clip_range           | 0.2        |
|    entropy_loss         | -0.407     |
|    explained_variance   | 0.6        |
|    learning_rate        | 0.0003     |
|    loss                 | 1.53       |
|    n_updates            | 48400      |
|    policy_gradient_loss | -0.0404    |
|    value_loss           | 9.12       |
----------------------------------------
Exception: get_idx < pet-hedgehog 10-1 status-honey-bee 2-1 > not found
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 92.7        |
|    ep_rew_mean          | 33.1        |
| time/                   |             |
|    fps                  | 123         |
|    iterations           | 1           |
|    time_elapsed         | 16          |
|    total_timesteps      | 2048        |
| train/                  |             |
|    approx_kl            | 0.043567862 |
|    clip_fraction        | 0.208       |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.48       |
|    explained_variance   | 0.68        |
|    learning_rate        | 0.0003      |
|    loss                 | 5.09        |
|    n_updates            | 48410       |
|    policy_gradient_loss | -0.0425     |
|    value_loss           | 9.02        |
-----------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 83.9        |
|    ep_rew_mean          | 24.1        |
| time/                   |             |
|    fps                  | 116         |
|    iterations           | 2           |
|    time_elapsed         | 35          |
|    total_timesteps      | 4096        |
| train/                  |             |
|    approx_kl            | 0.041116748 |
|    clip_fraction        | 0.179       |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.366      |
|    explained_variance   | 0.631       |
|    learning_rate        | 0.0003      |
|    loss                 | 5.58        |
|    n_updates            | 48420       |
|    policy_gradient_loss | -0.0446     |
|    value_loss           | 16.3        |
-----------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 83          |
|    ep_rew_mean          | 21.3        |
| time/                   |             |
|    fps                  | 114         |
|    iterations           | 3           |
|    time_elapsed         | 53          |
|    total_timesteps      | 6144        |
| train/                  |             |
|    approx_kl            | 0.056923926 |
|    clip_fraction        | 0.187       |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.404      |
|    explained_variance   | 0.585       |
|    learning_rate        | 0.0003      |
|    loss                 | 3.17        |
|    n_updates            | 48430       |
|    policy_gradient_loss | -0.043      |
|    value_loss           | 12.3        |
-----------------------------------------
----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 83.6       |
|    ep_rew_mean          | 21.4       |
| time/                   |            |
|    fps                  | 112        |
|    iterations           | 4          |
|    time_elapsed         | 72         |
|    total_timesteps      | 8192       |
| train/                  |            |
|    approx_kl            | 0.05702912 |
|    clip_fraction        | 0.185      |
|    clip_range           | 0.2        |
|    entropy_loss         | -0.42      |
|    explained_variance   | 0.476      |
|    learning_rate        | 0.0003     |
|    loss                 | 3.65       |
|    n_updates            | 48440      |
|    policy_gradient_loss | -0.0412    |
|    value_loss           | 14.4       |
----------------------------------------
---------------------------------------
| rollout/                |           |
|    ep_len_mean          | 77.3      |
|    ep_rew_mean          | 16.2      |
| time/                   |           |
|    fps                  | 112       |
|    iterations           | 5         |
|    time_elapsed         | 91        |
|    total_timesteps      | 10240     |
| train/                  |           |
|    approx_kl            | 0.0575137 |
|    clip_fraction        | 0.185     |
|    clip_range           | 0.2       |
|    entropy_loss         | -0.396    |
|    explained_variance   | 0.469     |
|    learning_rate        | 0.0003    |
|    loss                 | 4.42      |
|    n_updates            | 48450     |
|    policy_gradient_loss | -0.0437   |
|    value_loss           | 13.3      |
---------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 78.6        |
|    ep_rew_mean          | 17.3        |
| time/                   |             |
|    fps                  | 112         |
|    iterations           | 6           |
|    time_elapsed         | 109         |
|    total_timesteps      | 12288       |
| train/                  |             |
|    approx_kl            | 0.077249065 |
|    clip_fraction        | 0.219       |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.47       |
|    explained_variance   | 0.525       |
|    learning_rate        | 0.0003      |
|    loss                 | 2.23        |
|    n_updates            | 48460       |
|    policy_gradient_loss | -0.0489     |
|    value_loss           | 9.37        |
-----------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 78.3        |
|    ep_rew_mean          | 17.8        |
| time/                   |             |
|    fps                  | 112         |
|    iterations           | 7           |
|    time_elapsed         | 127         |
|    total_timesteps      | 14336       |
| train/                  |             |
|    approx_kl            | 0.048610996 |
|    clip_fraction        | 0.182       |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.408      |
|    explained_variance   | 0.645       |
|    learning_rate        | 0.0003      |
|    loss                 | 2.42        |
|    n_updates            | 48470       |
|    policy_gradient_loss | -0.041      |
|    value_loss           | 11.6        |
-----------------------------------------
----------------------------------------
| rollout/                |            |
|    ep_len_mean          | 77.4       |
|    ep_rew_mean          | 17.6       |
| time/                   |            |
|    fps                  | 111        |
|    iterations           | 8          |
|    time_elapsed         | 147        |
|    total_timesteps      | 16384      |
| train/                  |            |
|    approx_kl            | 0.04007852 |
|    clip_fraction        | 0.202      |
|    clip_range           | 0.2        |
|    entropy_loss         | -0.448     |
|    explained_variance   | 0.687      |
|    learning_rate        | 0.0003     |
|    loss                 | 2.07       |
|    n_updates            | 48480      |
|    policy_gradient_loss | -0.0456    |
|    value_loss           | 12         |
----------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 76.6        |
|    ep_rew_mean          | 17.3        |
| time/                   |             |
|    fps                  | 111         |
|    iterations           | 9           |
|    time_elapsed         | 165         |
|    total_timesteps      | 18432       |
| train/                  |             |
|    approx_kl            | 0.056452066 |
|    clip_fraction        | 0.186       |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.419      |
|    explained_variance   | 0.539       |
|    learning_rate        | 0.0003      |
|    loss                 | 5.39        |
|    n_updates            | 48490       |
|    policy_gradient_loss | -0.043      |
|    value_loss           | 17.7        |
-----------------------------------------
-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 75.7        |
|    ep_rew_mean          | 17.1        |
| time/                   |             |
|    fps                  | 110         |
|    iterations           | 10          |
|    time_elapsed         | 184         |
|    total_timesteps      | 20480       |
| train/                  |             |
|    approx_kl            | 0.043880884 |
|    clip_fraction        | 0.188       |
|    clip_range           | 0.2         |
|    entropy_loss         | -0.453      |
|    explained_variance   | 0.543       |
|    learning_rate        | 0.0003      |
|    loss                 | 6.02        |
|    n_updates            | 48500       |
|    policy_gradient_loss | -0.0444     |
|    value_loss           | 14.9        |
-----------------------------------------

Multiple screens

The process fails to detect the correct monitor where the game is.
Got an error from the find_the_animals routine.

help

Traceback (most recent call last):
  File "C:\Users\Admin\Downloads\GAMEBOT\super-ml-pets-0.0.6\main.py", line 49, in <module>
    run(ret)
  File "C:\Users\Admin\Downloads\GAMEBOT\super-ml-pets-0.0.6\src\deploy_agent.py", line 106, in run
    pets, _ = find_the_animals(
ValueError: not enough values to unpack (expected 2, got 0)
