The shologuti from ehtnamuh

#2 Cannot pause the game while training using built-in PPO algorithm

While training with mlagents-learn in the Unity Editor, the pause function is ignored and the game keeps resetting every few steps.

#9 Implement TDGutiAgent Class and make GutiAgent Abstract

Make GutiAgent an abstract class. This is needed for code extensibility and for adding other types of agents in the future.

#13 Create ObservationGenerator Class

Create an ObservationGenerator class. It will hold an instance of the board and serve GutiAgent types with the specific observations they need.
The class should be abstract or an interface, as it is intended for extension.
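
A minimal sketch of that shape, written in Python for illustration only (the project itself is a Unity project). It assumes the board can be read as a flat list of cell values; `get_observation` and `RawBoardObservationGenerator` are hypothetical names, not taken from the repository.

```python
from abc import ABC, abstractmethod


class ObservationGenerator(ABC):
    """Holds a reference to the board and serves observations to agents."""

    def __init__(self, board):
        # Assumption: `board` is a flat list of cell values.
        self.board = board

    @abstractmethod
    def get_observation(self):
        """Return the observation for one kind of agent."""


class RawBoardObservationGenerator(ObservationGenerator):
    """Hypothetical concrete generator: returns a copy of the raw board."""

    def get_observation(self):
        return list(self.board)
```

Each agent type would then receive the generator subclass that matches the observation format it expects.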

#16 Make Player Manager Class

Make a manager class for storing players. The manager class can expose variables in the editor so that designers can easily change which types of players they wish to spawn.
The game manager class can hold a reference to the player manager class.
The player manager class will also be useful once a Settings class is created to control the major settings of the game during play.

#3 GameManager Class is too heavy.

The GameManager class needs to be refactored and separated out into more classes.
Possible Solution: Separate out a Scoreboard class and a RuleBook class to handle the management of scoring and rule checking.

#7 Store MinMax Simulated move evaluations in a list

Store the MinMax agent's evaluations in a list. This will help in taking more random actions.
Edit the MinMax function in the MinMaxAI class:
If a new state is encountered with the same value as the current max value, push that move into a MaxValue list.
If a state is then encountered that has a higher value than the previously known max value,
empty the MaxValue list and then push the newly found move.
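
The tie-collecting step described above could look roughly like this. It is a Python sketch for illustration, not the MinMaxAI code; it assumes the simulated moves arrive as `(move, value)` pairs, and `best_moves` and `pick_move` are hypothetical names.

```python
import random


def best_moves(scored_moves):
    """Return every move whose value ties for the maximum.

    Assumption: `scored_moves` is an iterable of (move, value) pairs.
    """
    max_value = None
    max_value_moves = []
    for move, value in scored_moves:
        if max_value is None or value > max_value:
            # Strictly better value: empty the list and keep only this move.
            max_value = value
            max_value_moves = [move]
        elif value == max_value:
            # Same value as the current max: keep it as another candidate.
            max_value_moves.append(move)
    return max_value_moves


def pick_move(scored_moves, rng=random):
    """Choose uniformly at random among the tied best moves."""
    candidates = best_moves(scored_moves)
    return rng.choice(candidates) if candidates else None
```

Picking at random among the tied moves is what makes the agent less predictable without making it play weaker moves.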

#1 Restarting game randomly while training using built-in PPO

#12 Make RuleBook Class usable across multiple concurrent instances

Possible Solution:
Force the calling class to supply the RuleBook class with the instance of the board on which it wants to test the rule.
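
As a rough illustration of the idea (in Python, not the project's own code), a rule book that stores no board of its own can be shared by any number of concurrent games. The `can_step` check below is a placeholder rule invented for the example, not the actual game rules, and the cell encoding is assumed.

```python
EMPTY = 0  # assumed encoding for an empty cell


class RuleBook:
    """Holds no board of its own, so one instance can serve many games."""

    def can_step(self, board, player, source, target):
        # Placeholder rule: the source holds the player's piece
        # and the target cell is empty.
        return board[source] == player and board[target] == EMPTY


rule_book = RuleBook()
board_a = [1, 0, 2]  # one game instance
board_b = [0, 0, 2]  # another, independent game instance
```

The same `rule_book` can answer questions about `board_a` and `board_b` without either game affecting the other.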

#11 Create action mask for PPOAgent Class

Possible Solution:
Fetch GutiArray from the board.board class and, together with the moveList generated by the simulator class, use it to calculate the list of indexes that are not traversable.
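
One way to derive the blocked indexes from a list of legal moves is sketched below in Python for illustration. The action encoding (`source * cell_count + target`) is an assumption made for this example, and `encode_move` and `blocked_action_indexes` are hypothetical names; the real PPOAgent may encode actions differently.

```python
def encode_move(source, target, cell_count):
    """Map a (source, target) move to a single action index (assumed encoding)."""
    return source * cell_count + target


def blocked_action_indexes(legal_moves, cell_count):
    """Return every action index that does not correspond to a legal move."""
    allowed = {encode_move(s, t, cell_count) for s, t in legal_moves}
    return [i for i in range(cell_count * cell_count) if i not in allowed]
```

The returned list is what would be handed to the agent's masking step so that the policy cannot select those actions.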

#14 Separate out GameManager variables into Scriptable object

do it

#5 Environment simulation is hard-coupled to the MinMax class, and heavy

Possible Solution:
Create a simulator class.
The class has access to the logical guti board (guti_map) and can map Address objects to their relevant representations
in a 1D array. The move and reverseMove functions can be rewritten to manipulate the 1D array. By manipulating the
array directly instead of creating it every time, the performance of simulating the game should improve significantly.
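
The in-place move and undo idea can be sketched as follows (Python, for illustration only). It assumes cell indexes have already been mapped from Address objects, that `0` marks an empty cell, and that a move may remove one jumped piece; the `Simulator` layout and the `captured` parameter are assumptions for this example, not the repository's API.

```python
EMPTY = 0  # assumed encoding for an empty cell


class Simulator:
    def __init__(self, cells):
        self.cells = list(cells)  # flat 1D board, copied once up front
        self.history = []         # stack of applied moves, used for undo

    def move(self, source, target, captured=None):
        """Apply a move in place; `captured` is the index of a jumped piece, if any."""
        captured_piece = self.cells[captured] if captured is not None else EMPTY
        self.history.append((source, target, captured, captured_piece))
        self.cells[target] = self.cells[source]
        self.cells[source] = EMPTY
        if captured is not None:
            self.cells[captured] = EMPTY

    def reverse_move(self):
        """Undo the most recent move, restoring any captured piece."""
        source, target, captured, captured_piece = self.history.pop()
        self.cells[source] = self.cells[target]
        self.cells[target] = EMPTY
        if captured is not None:
            self.cells[captured] = captured_piece
```

Because a search applies and then reverses each candidate move, the same array is reused throughout and no board copy is needed per simulated state.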

Since the agents are trained from the Green Player's perspective, when fetching observations from the ObservationGenerator class for the Red Player,
simply reverse the observation list.
Besides reversing the observation, the move indexes must also be translated from the Green Player's perspective to the Red Player's perspective every time.
This will lead to a lot of custom, tightly coupled code.
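
A small Python sketch of the two translations, for illustration only. It assumes the board is laid out so that reversing the flat list corresponds to viewing it from the opposite side, which means cell `i` maps to cell `cell_count - 1 - i`; the function names are hypothetical.

```python
def flip_observation(observation):
    """Reverse the observation list to view the board from the other side."""
    return observation[::-1]


def flip_index(index, cell_count):
    """Translate one cell index to the opposite player's perspective."""
    return cell_count - 1 - index


def flip_move(move, cell_count):
    """Translate a (source, target) move to the opposite player's perspective."""
    source, target = move
    return flip_index(source, cell_count), flip_index(target, cell_count)
```

Applying `flip_move` twice returns the original move, so the same helper can translate in both directions.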
