andreped / super-ml-pets
AI for Super Auto Pets
License: MIT License
This makes it convenient to choose the verbosity level when training and deploying models.
See this example for how it can be done.
Traceback (most recent call last):
File "C:\Users\Admin\Downloads\GAMEBOT\super-ml-pets-0.0.6\main.py", line 49, in
run(ret)
File "C:\Users\Admin\Downloads\GAMEBOT\super-ml-pets-0.0.6\src\deploy_agent.py", line 106, in run
pets, _ = find_the_animals(
ValueError: not enough values to unpack (expected 2, got 0)
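A defensive sketch of how the unpacking could be guarded so an empty detection result raises a clear error instead of a `ValueError` on unpack; `safe_find_the_animals` and the stub detector are hypothetical, based only on the traceback above:

```python
# Hypothetical sketch: validate the CV result before unpacking it.
# The detection routine and its (pets, food) return format are assumptions
# inferred from the traceback above.

def safe_find_the_animals(detect):
    """Call a detection routine and validate its result before unpacking."""
    result = detect()
    if not result or len(result) != 2:
        raise RuntimeError(
            "Screen capture returned no detections; check that the game "
            "window is visible on the expected monitor."
        )
    pets, food = result
    return pets, food

# Usage: a stub detector standing in for the real CV routine.
pets, food = safe_find_the_animals(lambda: (["fish", "horse"], ["honey"]))
```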
After training for a while, that is, fine-tuning the current best model(s), we find that the performance plateaus at a maximum win-ratio reward of around 20-25.
We should experiment with different values for hyperparameters such as batch size and learning rate.
Also, sapai-gym
does not currently support freezing. Hence, if we add that, the model should likely improve, as freezing is an important aspect of the game, especially for scaling.
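The hyperparameter experiments mentioned above could be sketched as a simple grid search, assuming the MaskablePPO setup from sb3_contrib used elsewhere in the project; `make_env` is a hypothetical helper standing in for however the SuperAutoPetsEnv is constructed:

```python
# Sketch (not tested here): grid over batch size and learning rate.
from sb3_contrib import MaskablePPO

env = make_env()  # assumed helper constructing a SuperAutoPetsEnv

for batch_size in (64, 128, 256):
    for learning_rate in (3e-4, 1e-4, 3e-5):
        model = MaskablePPO(
            "MlpPolicy", env,
            batch_size=batch_size,
            learning_rate=learning_rate,
        )
        model.learn(total_timesteps=100_000)
```

Each run would then be compared on the win-ratio reward after the same number of timesteps.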
Below is a plot of the training history, displayed as one continuous run, but where some crashes (and restarts) have occurred (which could explain some of the spikes and sudden drops).
It is probably a good idea to stick to a single coding standard for methods and the like.
PEP 8 is probably the way to go.
During deployment, when buying the otter, it gives +2 attack and +1 defence to a random team member. This is not recorded in the team.
When doing some CI tests, I observed that for some setups training never stopped. It seemed to always reach the same successful state. Hence, when restarting the training, are we properly resetting the SAP environment from sapai-gym? Perhaps we need to set a seed?
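One way to make restarts reproducible, and to rule out the seeding question above, is to seed every RNG source explicitly before training. This is a minimal stdlib sketch; the NumPy seeding is an assumption (hedged behind an import check), since sapai may draw from NumPy's global RNG:

```python
import random

def seed_everything(seed: int) -> None:
    """Seed every RNG source the environment might draw from."""
    random.seed(seed)
    try:
        import numpy as np  # only if installed; sapai may use NumPy's RNG
        np.random.seed(seed)
    except ImportError:
        pass

seed_everything(42)
a = random.random()
seed_everything(42)
b = random.random()
assert a == b  # two "fresh" runs with the same seed start identically
```

Calling `seed_everything` right before each restart would at least make it clear whether the environment itself carries hidden state between runs.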
As for the other actions, we need to add methods to handle certain events, such as movement of animals.
See here for where to add event handling during deployment.
I believe this is because most dependencies do not have precompiled wheels and need to be built from source.
This will likely be solved over time as more deps support Python 3.11 natively, but for now we are stuck with 15 mins of dependency installation (compared to 3 mins normally).
An example CI can be seen here.
The process fails to detect the correct monitor where the game is.
Got an error from the find_the_animals
routine.
As different monitors may have different resolutions, this will affect the machine vision system that captures elements on screen.
Currently, there are hard-coded parameters for the locations of objects on the screen, which will change if, for instance, a 4K monitor is used.
We should capture the resolution the game is run at and scale the hard-coded values accordingly, to make this more dynamic.
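The scaling idea could look something like this; the 1920x1080 reference resolution and the example coordinates are assumptions for illustration:

```python
# Sketch: scale hard-coded 1920x1080 screen coordinates to the actual
# game resolution, so detection regions follow the monitor in use.
REF_W, REF_H = 1920, 1080  # resolution the current constants were tuned for

def scale_point(x, y, screen_w, screen_h):
    """Map a reference-resolution coordinate to the current screen."""
    return round(x * screen_w / REF_W), round(y * screen_h / REF_H)

# A hypothetical 1080p shop-slot location mapped onto a 4K (3840x2160) monitor:
print(scale_point(450, 180, 3840, 2160))  # → (900, 360)
```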
This was observed after two games of playing, where the bot had lost the first game but won the second.
This happens right after the shop UI is shown after a game. Hence, the auto-clicker has stopped, and thus the bot thinks it is looking at the shop.
A fix could be to click for one second longer after the shop UI has appeared, to automatically skip this prompt.
Whenever the game lags, the CV system doesn't realize this and detects the shop slots again, causing a mismatch between the actual game and the engine. Proper lag detection is needed.
While installing gym==0.21.0, this error is seen.
It seems to have been broken for 2-3 weeks.
14:12:26 Traceback (most recent call last):
14:12:26 File "/usr/local/lib/python3.8/site-packages/wheel/vendored/packaging/requirements.py", line 35, in init
14:12:26 parsed = parse_requirement(requirement_string)
14:12:26 File "/usr/local/lib/python3.8/site-packages/wheel/vendored/packaging/_parser.py", line 64, in parse_requirement
14:12:26 return _parse_requirement(Tokenizer(source, rules=DEFAULT_RULES))
14:12:26 File "/usr/local/lib/python3.8/site-packages/wheel/vendored/packaging/_parser.py", line 82, in _parse_requirement
14:12:26 url, specifier, marker = _parse_requirement_details(tokenizer)
14:12:26 File "/usr/local/lib/python3.8/site-packages/wheel/vendored/packaging/_parser.py", line 126, in _parse_requirement_details
14:12:26 marker = _parse_requirement_marker(
14:12:26 File "/usr/local/lib/python3.8/site-packages/wheel/vendored/packaging/_parser.py", line 147, in _parse_requirement_marker
14:12:26 tokenizer.raise_syntax_error(
14:12:26 File "/usr/local/lib/python3.8/site-packages/wheel/vendored/packaging/_tokenizer.py", line 163, in raise_syntax_error
14:12:26 raise ParserSyntaxError(
14:12:26 wheel.vendored.packaging._tokenizer.ParserSyntaxError: Expected end or semicolon (after version specifier)
14:12:26 opencv-python>=3.
14:12:26 ~~~^
14:12:26
14:12:26 The above exception was the direct cause of the following exception:
14:12:26
14:12:26 Traceback (most recent call last):
14:12:26 File "", line 2, in
14:12:26 File "", line 34, in
14:12:26 File "/tmp/pip-wheel-34ur4kuy/gym_f74ca167b0dc4f1da53d4decc6d2d4f6/setup.py", line 39, in
14:12:26 setup(
14:12:26 File "/usr/local/lib/python3.8/site-packages/setuptools/init.py", line 153, in setup
14:12:26 return distutils.core.setup(**attrs)
14:12:26 File "/usr/local/lib/python3.8/distutils/core.py", line 148, in setup
14:12:26 dist.run_commands()
14:12:26 File "/usr/local/lib/python3.8/distutils/dist.py", line 966, in run_commands
14:12:26 self.run_command(cmd)
14:12:26 File "/usr/local/lib/python3.8/distutils/dist.py", line 985, in run_command
14:12:26 cmd_obj.run()
14:12:26 File "/usr/local/lib/python3.8/site-packages/wheel/bdist_wheel.py", line 395, in run
14:12:26 self.egg2dist(self.egginfo_dir, distinfo_dir)
14:12:26 File "/usr/local/lib/python3.8/site-packages/wheel/bdist_wheel.py", line 534, in egg2dist
14:12:26 pkg_info = pkginfo_to_metadata(egginfo_path, pkginfo_path)
14:12:26 File "/usr/local/lib/python3.8/site-packages/wheel/metadata.py", line 160, in pkginfo_to_metadata
14:12:26 for key, value in generate_requirements({extra: reqs}):
14:12:26 File "/usr/local/lib/python3.8/site-packages/wheel/metadata.py", line 138, in generate_requirements
14:12:26 for new_req in convert_requirements(depends):
14:12:26 File "/usr/local/lib/python3.8/site-packages/wheel/metadata.py", line 103, in convert_requirements
14:12:26 parsed_requirement = Requirement(req)
14:12:26 File "/usr/local/lib/python3.8/site-packages/wheel/vendored/packaging/requirements.py", line 37, in init
14:12:26 raise InvalidRequirement(str(e)) from e
14:12:26 wheel.vendored.packaging.requirements.InvalidRequirement: Expected end or semicolon (after version specifier)
14:12:26 opencv-python>=3.
14:12:26 ~~~^
14:12:26 [end of output]
14:12:26
14:12:26 note: This error originates from a subprocess, and is likely not a problem with pip.
14:12:26 ERROR: Failed building wheel for gym
14:12:26 Running setup.py clean for gym
building_wheel_gym_error.log
As Ubuntu 18.x is deprecated in GitHub Actions, or at least there will not be any containers using it, it is probably a good idea to bump to Ubuntu 20.x, or use a separate Docker image with Ubuntu 18.x on the fly in the CI.
The CI script was made for the old solution.
The unit test should involve training in a simulated environment, using different operating systems and Python versions.
When deploying the AI against real people, I observed that it tends to sell an animal (maybe always the first?). When this happens, it seems to try to move the first animal (which has been sold) to another place, which of course is not possible. The AI then just starts doing strange things. Perhaps there is something wrong with the engine when animals are sold? Or does the machine vision system fail at times? Perhaps a longer pause is needed between each action?
Currently, we are saving the training history in a .csv file using the CSVLogger tool. However, it would be great to support saving the history as TensorBoard logs, as these are quite commonly used by developers.
If it works properly, we may also replace the CSV solution with it completely.
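Stable-Baselines3 has built-in TensorBoard support, so this might only require pointing the model at a log directory. A sketch, assuming the project's MaskablePPO setup; `make_env` is a hypothetical helper for constructing the SuperAutoPetsEnv:

```python
# Sketch (not tested here): write TensorBoard event files during training.
from sb3_contrib import MaskablePPO

env = make_env()  # assumed helper constructing a SuperAutoPetsEnv

model = MaskablePPO(
    "MlpPolicy", env,
    tensorboard_log="./history/tb_logs/",  # view with: tensorboard --logdir ./history/tb_logs/
)
model.learn(total_timesteps=10_000, tb_log_name="rl_model")
```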
The bot plays extremely slowly (1000% slower than a snail). Also, can someone record the bot playing and send me the video? I don't know if the problem is on my end. Resolution: 1920x1080. Fullscreen: yes.
Apparently there is an action called "buy_food_team", which I was not aware of. This is not handled by the current deployment script.
Hi!
I'm trying to run the script, but I'm getting the numpy error "TypeError: unhashable type: 'numpy.ndarray'". This is on the latest version; I tried it on Windows and Ubuntu with Python 3.10 and 3.11, same issue. The script seems to see the screen, but fails to continue.
Any idea?
Thanks!
INFO main.py 56 Running...
INFO deploy_agent.py 94 INITIALIZATION [self.run]: Loading Model
INFO deploy_agent.py 97 INITIALIZATION [self.run]: Create SuperAutoPetsEnv Object
INFO deploy_agent.py 104 CV SYSTEM [self.run]: Detect the Pets and Food in the Shop Slots
INFO deploy_agent.py 106 CV SYSTEM [self.run]: Calls [image_detection.find_the_animals]
INFO deploy_agent.py 111 CV SYSTEM [self.run]: The detected Pets and Food in the Shop is : ['fish', 'horse', 'pig', 'honey']
INFO deploy_agent.py 112 GAME ENGINE [self.run]: Set Environment Shop = detected Pets and Food
INFO deploy_agent.py 121 GAME ENGINE [self.run]: Get the best action to make for the given state from the loaded model
Traceback (most recent call last):
File "/home/pablo/super-ml-pets/main.py", line 63, in <module>
main()
File "/home/pablo/super-ml-pets/main.py", line 57, in main
run(ret)
File "/home/pablo/super-ml-pets/src/deploy_agent.py", line 128, in run
" Current Team and Shop \n{}".format(s[action][0]))
TypeError: unhashable type: 'numpy.ndarray'
Quite annoyingly, the keyboard
library requires sudo
rights in order to be used on macOS.
If this is not done, you will get this error message.
The alternative is to run the script like so: sudo python deploy.py
Even so, with sudo the program might fail to find the dependencies (observed on macOS).
Nonetheless, this is bad practice. We should not need sudo permissions to deploy the software. We need to find a way to enable keyboard events on macOS (all UNIX-based systems likely have the same issue, e.g., Ubuntu).
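For reference, one possible way around root access (an assumption, not something the project currently uses) is the pynput library, which on macOS asks for Accessibility/Input Monitoring permissions instead of sudo:

```python
# Sketch (untested in this project): listen for a key press without root.
# pynput and its permission model on macOS are assumptions here.
from pynput import keyboard

def on_press(key):
    if key == keyboard.Key.space:
        print("space pressed - resuming")
        return False  # returning False stops the listener

with keyboard.Listener(on_press=on_press) as listener:
    listener.join()
```

Whether pynput covers all the events the deployment loop needs would have to be verified.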
Tested deployment on Ubuntu.
Also needed to install tkinter as an additional requirement.
Ran it with:
sudo su
source venv/bin/activate
xhost +
export DISPLAY=:0.0
Traceback (most recent call last):
File "main.py", line 37, in
pause()
File "/home/xxx/super-ml-pets/src/agent.py", line 22, in pause
if keyboard.read_key() == 'space':
File "/home/xxx/super-ml-pets/venv/lib/python3.8/site-packages/keyboard/init.py", line 935, in read_key
event = read_event(suppress)
File "/home/xxx/super-ml-pets/venv/lib/python3.8/site-packages/keyboard/init.py", line 924, in read_event
hooked = hook(queue.put, suppress=suppress)
File "/home/xxx/super-ml-pets/venv/lib/python3.8/site-packages/keyboard/init.py", line 461, in hook
append(callback)
File "/home/xxx/super-ml-pets/venv/lib/python3.8/site-packages/keyboard/_generic.py", line 67, in add_handler
self.start_if_necessary()
File "/home/xxx/super-ml-pets/venv/lib/python3.8/site-packages/keyboard/_generic.py", line 35, in start_if_necessary
self.init()
File "/home/xxx/super-ml-pets/venv/lib/python3.8/site-packages/keyboard/init.py", line 196, in init
_os_keyboard.init()
File "/home/xxx/super-ml-pets/venv/lib/python3.8/site-packages/keyboard/_nixkeyboard.py", line 113, in init
build_device()
File "/home/xxx/super-ml-pets/venv/lib/python3.8/site-packages/keyboard/_nixkeyboard.py", line 109, in build_device
ensure_root()
File "/home/xxx/super-ml-pets/venv/lib/python3.8/site-packages/keyboard/_nixcommon.py", line 174, in ensure_root
raise ImportError('You must be root to use this library on linux.')
ImportError: You must be root to use this library on linux.
Switching to classical style partly resolves this issue as mentioned in #52 (comment).
This uses the old style of the animals. However, the UI might still be slightly different.
When using the AI with the game during deployment, there is no simple way to stop running the AI without CTRL+C
from the terminal. However, as the AI uses move events, this might be challenging to do, especially at certain stages.
A good option is to listen for a specific keyboard event, e.g., ESCAPE key press, to stop the AI.
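A sketch of that idea using the keyboard library already in the project: register a global ESC hotkey that flips a flag the main loop checks between actions. The loop body here is a placeholder:

```python
# Sketch (untested here; the keyboard library needs root on UNIX, as noted above).
import threading
import keyboard  # already a project dependency

stop_requested = threading.Event()
keyboard.add_hotkey("esc", stop_requested.set)  # ESC flips the flag

while not stop_requested.is_set():
    ...  # perform the next AI action here
print("ESC pressed - stopping the AI.")
```

Checking the flag only between actions avoids interrupting a move mid-drag.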
As discussed in this thread, sapai-gym
and sapai
are no longer compatible:
alexdriedger/sapai-gym#11
The simplest way to use sapai-gym
is to use an older version of sapai
. However, when installing naively, sapai
is installed from the master branch. Hence, installation will fail.
I will have a separate branch in forks of both framework which has the stable versions.
This stems from sapai-gym
, which has not added support for it yet:
https://github.com/alexdriedger/sapai-gym/blob/master/sapai_gym/SuperAutoPetsEnv.py#L294
When this is added, we will also need to add support for it in the machine vision system.
Just attempted to setup the project using GitHub codespaces, and during pip install -r requirements.txt
, an error occurred:
Collecting sapai@ git+https://github.com/andreped/sapai.git@update-stats
Cloning https://github.com/andreped/sapai.git (to revision update-stats) to /tmp/pip-install-eln35jls/sapai_2c0fd4b241034f06956f0a9abf03bed9
Running command git clone --filter=blob:none --quiet https://github.com/andreped/sapai.git /tmp/pip-install-eln35jls/sapai_2c0fd4b241034f06956f0a9abf03bed9
Running command git checkout -b update-stats --track origin/update-stats
Switched to a new branch 'update-stats'
branch 'update-stats' set up to track 'origin/update-stats'.
Resolved https://github.com/andreped/sapai.git to commit bdc16215157a294194085be954d38c9a100a9b53
Preparing metadata (setup.py) ... done
Collecting gym~=0.21.0
Downloading gym-0.21.0.tar.gz (1.5 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.5/1.5 MB 290.3 MB/s eta 0:00:00
Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error
× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [1 lines of output]
error in gym setup command: 'extras_require' must be a dictionary whose values are strings or lists of strings containing valid project/version requirement specifiers.
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed
× Encountered error while generating package metadata.
╰─> See above for output.
note: This is an issue with the package mentioned above, not pip.
A cool idea I just had would be to be able to deploy a trained AI in the web app through a web extension, e.g., chrome.
Initially, I assumed that one would need to write lots of JavaScript to do that, or at least set up some sort of microservice, but then I remembered that PyScript is a thing, and maybe it can be used to solve this.
Might be challenging with lots of dependencies, but I guess we could try.
Currently, when training, the training log will be stored in a specific place ./history/sb3_log/
.
With how the logging works, if all models are linked to the same file, it will either 1) overwrite it or 2) append to it. It would be better if the histories of different models were saved into independent files.
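A small sketch of one way to do this: give each training run its own log directory, keyed by model name and a timestamp, so histories are never overwritten or mixed. The directory layout is an assumption based on the `./history/sb3_log/` path mentioned above:

```python
import os
from datetime import datetime

def make_log_dir(base="./history", model_name="rl_model"):
    """Create and return a unique per-run log directory."""
    run_id = datetime.now().strftime("%Y%m%d-%H%M%S")
    path = os.path.join(base, f"{model_name}_{run_id}")
    os.makedirs(path, exist_ok=True)
    return path

log_dir = make_log_dir(model_name="rl_model")  # e.g. ./history/rl_model_20240101-120000
```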
The engine runs on an older version of SAP where for instance a fish was a 3-3 and not a 2-2. We need to update sapai to have the updated numbers, such that the trained model transfers well to the real game.
It is possible to have a markdown table that includes which Python versions and operating systems are supported (and what is compatible). Furthermore, one could synchronize this table directly with the CI tests. It would be very convenient for the user to see what currently works.
I keep getting this runtime warning. Not sure if this is specific to me or everyone is getting this.
C:\Users\jivit\anaconda3\envs\super\lib\site-packages\skimage\metrics\_structural_similarity.py:230: RuntimeWarning: invalid value encountered in divide
S = (A1 * A2) / D
The project still works after this warning.
Just tested the framework on a macOS Monterey (12.3.1) using Python 3.7.9.
Training works just fine, but deployment crashes.
First time running can also be a bit slow, but that is likely due to macOS running some security checks.
Sudo is also required to run the software, likely due to the keyboard library being used. The same was observed on Ubuntu.
Error prompt after running deployment:
Traceback (most recent call last):
File "main.py", line 49, in <module>
run(ret)
File "/Users/X/workspace/super-ml-pets/src/deploy_agent.py", line 98, in run
model = MaskablePPO.load(ret.infer_model, custom_objects=custom_objects)
File "/Users/X/workspace/super-ml-pets/venv/lib/python3.7/site-packages/stable_baselines3/common/base_class.py", line 705, in load
path, device=device, custom_objects=custom_objects, print_system_info=print_system_info
File "/Users/X/workspace/super-ml-pets/venv/lib/python3.7/site-packages/stable_baselines3/common/save_util.py", line 435, in load_from_zip_file
th_object = th.load(file_content, map_location=device)
File "/Users/X/workspace/super-ml-pets/venv/lib/python3.7/site-packages/torch/serialization.py", line 713, in load
return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
File "/Users/X/workspace/super-ml-pets/venv/lib/python3.7/site-packages/torch/serialization.py", line 920, in _legacy_load
magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, '\x00'.
If we take a look at the pets under the standard expansion, there is no Sloth. But recently, while playing in the standard pack, the pet Sloth appeared (it didn't occur to me to take a screenshot).
This anomaly may cause the computer vision system to crash. Proper exception handling is needed to prevent this.
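One way such exception handling could look: wrap the label lookup so an animal outside the expected pack (like the stray Sloth) is logged and skipped instead of crashing the CV system. The pet set and function name here are illustrative assumptions:

```python
import logging

KNOWN_PETS = {"fish", "horse", "pig", "sloth"}  # illustrative subset only

def filter_detections(labels):
    """Keep recognized labels; warn about (but survive) unknown ones."""
    kept = []
    for label in labels:
        if label in KNOWN_PETS:
            kept.append(label)
        else:
            logging.warning("Unknown detection %r - ignoring it.", label)
    return kept

print(filter_detections(["fish", "sloth", "mystery-pet"]))  # → ['fish', 'sloth']
```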
What is the recommended setting for training, and how long have you been training the v6 one?
After an Exception happens, for whatever reason, the ep_rew_mean
and ep_len_mean
are much higher than usual. Are we properly resetting the environment before restarting the training? Or is there a more important issue? Perhaps the opponents are reset to their poorest state, meaning that after restarting we are playing against easier opponents?
Note that after a while it goes down to a similar level that it was before the crash.
This is the prompt I got around the Exception:
----------------------------------------
| rollout/ | |
| ep_len_mean | 72.6 |
| ep_rew_mean | 13.8 |
| time/ | |
| fps | 105 |
| iterations | 300 |
| time_elapsed | 5833 |
| total_timesteps | 614400 |
| train/ | |
| approx_kl | 0.06014028 |
| clip_fraction | 0.185 |
| clip_range | 0.2 |
| entropy_loss | -0.407 |
| explained_variance | 0.6 |
| learning_rate | 0.0003 |
| loss | 1.53 |
| n_updates | 48400 |
| policy_gradient_loss | -0.0404 |
| value_loss | 9.12 |
----------------------------------------
Exception: get_idx < pet-hedgehog 10-1 status-honey-bee 2-1 > not found
-----------------------------------------
| rollout/ | |
| ep_len_mean | 92.7 |
| ep_rew_mean | 33.1 |
| time/ | |
| fps | 123 |
| iterations | 1 |
| time_elapsed | 16 |
| total_timesteps | 2048 |
| train/ | |
| approx_kl | 0.043567862 |
| clip_fraction | 0.208 |
| clip_range | 0.2 |
| entropy_loss | -0.48 |
| explained_variance | 0.68 |
| learning_rate | 0.0003 |
| loss | 5.09 |
| n_updates | 48410 |
| policy_gradient_loss | -0.0425 |
| value_loss | 9.02 |
-----------------------------------------
-----------------------------------------
| rollout/ | |
| ep_len_mean | 83.9 |
| ep_rew_mean | 24.1 |
| time/ | |
| fps | 116 |
| iterations | 2 |
| time_elapsed | 35 |
| total_timesteps | 4096 |
| train/ | |
| approx_kl | 0.041116748 |
| clip_fraction | 0.179 |
| clip_range | 0.2 |
| entropy_loss | -0.366 |
| explained_variance | 0.631 |
| learning_rate | 0.0003 |
| loss | 5.58 |
| n_updates | 48420 |
| policy_gradient_loss | -0.0446 |
| value_loss | 16.3 |
-----------------------------------------
-----------------------------------------
| rollout/ | |
| ep_len_mean | 83 |
| ep_rew_mean | 21.3 |
| time/ | |
| fps | 114 |
| iterations | 3 |
| time_elapsed | 53 |
| total_timesteps | 6144 |
| train/ | |
| approx_kl | 0.056923926 |
| clip_fraction | 0.187 |
| clip_range | 0.2 |
| entropy_loss | -0.404 |
| explained_variance | 0.585 |
| learning_rate | 0.0003 |
| loss | 3.17 |
| n_updates | 48430 |
| policy_gradient_loss | -0.043 |
| value_loss | 12.3 |
-----------------------------------------
----------------------------------------
| rollout/ | |
| ep_len_mean | 83.6 |
| ep_rew_mean | 21.4 |
| time/ | |
| fps | 112 |
| iterations | 4 |
| time_elapsed | 72 |
| total_timesteps | 8192 |
| train/ | |
| approx_kl | 0.05702912 |
| clip_fraction | 0.185 |
| clip_range | 0.2 |
| entropy_loss | -0.42 |
| explained_variance | 0.476 |
| learning_rate | 0.0003 |
| loss | 3.65 |
| n_updates | 48440 |
| policy_gradient_loss | -0.0412 |
| value_loss | 14.4 |
----------------------------------------
---------------------------------------
| rollout/ | |
| ep_len_mean | 77.3 |
| ep_rew_mean | 16.2 |
| time/ | |
| fps | 112 |
| iterations | 5 |
| time_elapsed | 91 |
| total_timesteps | 10240 |
| train/ | |
| approx_kl | 0.0575137 |
| clip_fraction | 0.185 |
| clip_range | 0.2 |
| entropy_loss | -0.396 |
| explained_variance | 0.469 |
| learning_rate | 0.0003 |
| loss | 4.42 |
| n_updates | 48450 |
| policy_gradient_loss | -0.0437 |
| value_loss | 13.3 |
---------------------------------------
-----------------------------------------
| rollout/ | |
| ep_len_mean | 78.6 |
| ep_rew_mean | 17.3 |
| time/ | |
| fps | 112 |
| iterations | 6 |
| time_elapsed | 109 |
| total_timesteps | 12288 |
| train/ | |
| approx_kl | 0.077249065 |
| clip_fraction | 0.219 |
| clip_range | 0.2 |
| entropy_loss | -0.47 |
| explained_variance | 0.525 |
| learning_rate | 0.0003 |
| loss | 2.23 |
| n_updates | 48460 |
| policy_gradient_loss | -0.0489 |
| value_loss | 9.37 |
-----------------------------------------
-----------------------------------------
| rollout/ | |
| ep_len_mean | 78.3 |
| ep_rew_mean | 17.8 |
| time/ | |
| fps | 112 |
| iterations | 7 |
| time_elapsed | 127 |
| total_timesteps | 14336 |
| train/ | |
| approx_kl | 0.048610996 |
| clip_fraction | 0.182 |
| clip_range | 0.2 |
| entropy_loss | -0.408 |
| explained_variance | 0.645 |
| learning_rate | 0.0003 |
| loss | 2.42 |
| n_updates | 48470 |
| policy_gradient_loss | -0.041 |
| value_loss | 11.6 |
-----------------------------------------
----------------------------------------
| rollout/ | |
| ep_len_mean | 77.4 |
| ep_rew_mean | 17.6 |
| time/ | |
| fps | 111 |
| iterations | 8 |
| time_elapsed | 147 |
| total_timesteps | 16384 |
| train/ | |
| approx_kl | 0.04007852 |
| clip_fraction | 0.202 |
| clip_range | 0.2 |
| entropy_loss | -0.448 |
| explained_variance | 0.687 |
| learning_rate | 0.0003 |
| loss | 2.07 |
| n_updates | 48480 |
| policy_gradient_loss | -0.0456 |
| value_loss | 12 |
----------------------------------------
-----------------------------------------
| rollout/ | |
| ep_len_mean | 76.6 |
| ep_rew_mean | 17.3 |
| time/ | |
| fps | 111 |
| iterations | 9 |
| time_elapsed | 165 |
| total_timesteps | 18432 |
| train/ | |
| approx_kl | 0.056452066 |
| clip_fraction | 0.186 |
| clip_range | 0.2 |
| entropy_loss | -0.419 |
| explained_variance | 0.539 |
| learning_rate | 0.0003 |
| loss | 5.39 |
| n_updates | 48490 |
| policy_gradient_loss | -0.043 |
| value_loss | 17.7 |
-----------------------------------------
-----------------------------------------
| rollout/ | |
| ep_len_mean | 75.7 |
| ep_rew_mean | 17.1 |
| time/ | |
| fps | 110 |
| iterations | 10 |
| time_elapsed | 184 |
| total_timesteps | 20480 |
| train/ | |
| approx_kl | 0.043880884 |
| clip_fraction | 0.188 |
| clip_range | 0.2 |
| entropy_loss | -0.453 |
| explained_variance | 0.543 |
| learning_rate | 0.0003 |
| loss | 6.02 |
| n_updates | 48500 |
| policy_gradient_loss | -0.0444 |
| value_loss | 14.9 |
-----------------------------------------
Right now we have added a CI test for verifying that training works as intended.
For deployment, this is more challenging to do, as we cannot directly play the game from a CI.
Hence, a suggestion could be to set up a simple virtual play session, where we have screenshots that let the AI interact.
It might be that it crashes anyway, but at least we can verify that some of the deployment functionality works.
We should use Flake8 for assessing code quality. It would also be nice to get a coverage
badge in the README.
I keep running into this error when trying to install the framework.
Namespace(finetune=None, model_name='rl_model', nb_games=10000, nb_retries=1, nb_steps=10000, task='deploy')
Pausing...
Running...
Traceback (most recent call last):
File "main.py", line 36, in
run(path)
File "C:\Users\jivit\Documents\Python_Scripts\RL\super-ml-pets\src\game_interaction\agent.py", line 69,
in run
model = MaskablePPO.load(model_path)
File "C:\Users\jivit\anaconda3\envs\gpu2\lib\site-packages\stable_baselines3\common\base_class.py", line 675, in load
path, device=device, custom_objects=custom_objects, print_system_info=print_system_info
File "C:\Users\jivit\anaconda3\envs\gpu2\lib\site-packages\stable_baselines3\common\save_util.py", line
419, in load_from_zip_file
data = json_to_data(json_data, custom_objects=custom_objects)
File "C:\Users\jivit\anaconda3\envs\gpu2\lib\site-packages\stable_baselines3\common\save_util.py", line
164, in json_to_data
deserialized_object = cloudpickle.loads(base64_object)
AttributeError: Can't get attribute '_make_function' on <module 'cloudpickle.cloudpickle' from 'C:\Users\jivit\anaconda3\envs\gpu2\lib\site-packages\cloudpickle\cloudpickle.py'>
I think it is because I am using a different version of Stable-Baselines3. Can you include the versions of sb3_contrib and Python you are using?
Currently, requirements.txt contains too much noise and overly specific versions, which also appear to be Windows-only (at least some of them). We should relax the version strictness.
Fixing this will enable the framework to work with all operating systems and Python versions again.
Right now the entire repo is about 327 MB, which is way higher than the recommended maximum size of 100 MB.
This also means that it takes very long to clone.
Currently, the source code is packed with methods and full scripts that are not used.
We should remove most of these, as they are from the old solution, which is quite different from the current framework.
We will also always have the commit history, so we can recover these old methods and scripts if necessary.
The bot mainly swaps the positions of pets. It barely does anything else. You should add a limit to the number of times the bot is allowed to swap pet positions, so it actually does something other than swapping.
We have a unit test to assess whether a simple training passes.
When it is done, it would be great to check that we can read from the generated log file and plot the result.
We do not necessarily care what it plots, as long as the entire script runs without error.
This is probably impossible, but it would be good if we had a way of training the AI by playing ourselves, or training it from video.
We should add descriptions of what the different functions and classes do.
This should be done in a standardized manner, probably something like this, which PyCharm suggests:
def test_function(p1, p2, p3):
    """
    test_function does blah blah blah.

    :param p1: description of parameter p1
    :param p2: description of parameter p2
    :param p3: description of parameter p3
    :return: description of what it returns
    """
    pass
When training RL models using sapai-gym, different errors tend to occur.
I have tried to use try-except blocks, but the problem is that when this happens, training with Stable-Baselines3 crashes, and we have to start all over again.
We should therefore either: 1) fix what is bugged in sapai/sapai-gym, or 2) add a wrapper function that catches the failure and tries to generate a new environment (if possible).
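Option 2 could be sketched like this. `make_env` stands in for however the environment is constructed, and `FlakyEnv` is a toy stand-in for sapai-gym used only to demonstrate the retry behavior:

```python
# Sketch: catch exceptions from env.step and rebuild the environment
# instead of letting the whole training run die.

def resilient_step(env, action, make_env, max_retries=3):
    """Try to step the env; on failure, recreate it and retry."""
    for _ in range(max_retries):
        try:
            return env, env.step(action)
        except Exception as exc:
            print(f"Environment crashed ({exc!r}); recreating it.")
            env = make_env()
    raise RuntimeError("Environment kept crashing after retries.")

# Usage with a toy environment that fails once, then succeeds:
class FlakyEnv:
    fails = [True]
    def step(self, action):
        if FlakyEnv.fails:
            FlakyEnv.fails.pop()
            raise ValueError("get_idx not found")
        return ("obs", 1.0, False, {})

env, result = resilient_step(FlakyEnv(), 0, FlakyEnv)
```

Hooking this into SB3 would likely mean wrapping it in a gym `Env` subclass, so `learn()` never sees the exception.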
The current checkpoint callback periodically saves a model at a given step interval. However, that is not necessarily the best model.
We should use the EvalCallback: https://stable-baselines3.readthedocs.io/en/master/guide/callbacks.html#evalcallback
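Per the SB3 documentation linked above, the usage could look roughly like this; `eval_env` and the save paths are assumptions:

```python
# Sketch (not tested here): keep the best-scoring model via EvalCallback.
from stable_baselines3.common.callbacks import EvalCallback

eval_callback = EvalCallback(
    eval_env,                               # assumed: a second SuperAutoPetsEnv
    best_model_save_path="./models/best/",  # best model is written here
    log_path="./history/eval/",
    eval_freq=10_000,
    deterministic=True,
)
model.learn(total_timesteps=1_000_000, callback=eval_callback)
```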
The current solution has used "\" to split paths, which only works on Windows. We should replace these with "/" (or proper path joining) and fix it so that it works cross-platform.
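The standard-library fix is to never hard-code either separator; the path below is illustrative:

```python
# Sketch: build paths with os.path.join or pathlib instead of "\" or "/",
# so the same code works on Windows, macOS, and Linux.
import os
from pathlib import Path

model_path = os.path.join("history", "sb3_log", "rl_model.zip")
same_path = Path("history") / "sb3_log" / "rl_model.zip"

assert model_path == str(same_path)  # both use the platform's native separator
```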
It would be great to have a more personal figure in the README to improve the first impression of the project.