shologuti's Issues
#2 Cannot pause the game while training using the built-in PPO algorithm
While training with mlagents-learn in the Unity Editor, the pause function is ignored and the game keeps resetting every few steps.
#9 Implement TDGutiAgent Class and make GutiAgent Abstract
Make GutiAgent an Abstract class. Needed for code extensibility and adding other types of agents in the future.
#13 Create ObservationGenerator Class
Create an ObservationGenerator class. It will hold an instance of the board and serve each GutiAgent type with the specific observations it needs.
The class should be abstract or an interface, as it is intended for extension.
#16 Make Player Manager Class
Make a manager class for storing players. The manager class can expose variables in the editor so that designers can easily change which types of players they wish to spawn.
The game manager class can hold a reference to the player manager class.
The player manager class will also be useful once a Settings class is created to control the major settings of the game during play.
#3 GameManager Class is too heavy.
The GameManager class needs to be refactored and separated out into more classes.
Possible Solution: Separate out a Scoreboard class and a RuleBook class to handle the management of scoring and rule checking.
#7 Store MinMax Simulated move evaluations in a list
Store the MinMax agent's evaluations in a list. This will help it take more random actions among equally good moves.
Edit the MinMax function in the MinMaxAI class:
If a new state is encountered with the same value as the current max value, push that move onto a MaxValue list.
If a state is then encountered that has a higher value than the previously known max value,
empty the MaxValue list and push the newly found move.
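The tie-collecting step described above can be sketched as follows. This is a minimal illustration in Python rather than the project's own code; `best_moves`, `pick_move`, and the `evaluate` callback are hypothetical names, and the evaluation function stands in for whatever the MinMax search returns for a move.

```python
import random

def best_moves(moves, evaluate):
    """Return every move that ties for the highest evaluation."""
    max_value = None
    max_moves = []  # plays the role of the MaxValue list
    for move in moves:
        value = evaluate(move)
        if max_value is None or value > max_value:
            # Strictly better move found: empty the list and start over.
            max_value = value
            max_moves = [move]
        elif value == max_value:
            # Same value as the current max: keep it as an alternative.
            max_moves.append(move)
    return max_moves

def pick_move(moves, evaluate, rng=random):
    """Pick uniformly at random among the tied best moves."""
    candidates = best_moves(moves, evaluate)
    return rng.choice(candidates) if candidates else None
```

Choosing at random only among tied moves keeps the agent's play strength unchanged while making it less predictable.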
#1 Restarting game randomly while training using built in PPO
#12 Make RuleBook Class usable across multiple concurrent instances
Possible Solution:
Force the calling class to supply the RuleBook class with the instance of the board on which it wants to test the rule.
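One way to read this proposal is a rule checker that keeps no board of its own. The Python sketch below assumes a flat list as the board and a single made-up rule (`is_empty_target`); the real RuleBook rules are not described in this issue, so treat both as placeholders.

```python
EMPTY = 0  # assumed encoding for an empty cell

class RuleBook:
    """Holds no board state, so any number of games can share it."""

    @staticmethod
    def is_empty_target(board, dst):
        # The caller supplies the board it wants the rule tested against.
        return board[dst] == EMPTY
```

Because every call receives its board explicitly, two concurrent environments cannot interfere with each other through the rule checker.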
#11 Create action mask for PPOAgent Class
Possible Solution:
Fetch GutiArray from the board.board class, and use it together with the moveList generated by the simulator class to calculate the list of action indexes that are not traversable.
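As a rough sketch of the masking idea, the Python below derives the blocked action indexes from a list of legal moves. The action encoding (`src * n_cells + dst`) and the function names are assumptions made for illustration; the issue does not say how PPOAgent actions are actually indexed.

```python
def encode_move(src, dst, n_cells):
    """Hypothetical flat encoding of a (source, destination) move."""
    return src * n_cells + dst

def blocked_action_indexes(move_list, n_cells):
    """Return every action index that is NOT in the legal move list."""
    legal = {encode_move(src, dst, n_cells) for src, dst in move_list}
    return [i for i in range(n_cells * n_cells) if i not in legal]
```

For a 3-cell toy board with legal moves `(0, 1)` and `(2, 0)`, the legal indexes are 1 and 6, so the other seven indexes would be masked out.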
#14 Separate out GameManager variables into a ScriptableObject
do it
#5 Environment simulation is tightly coupled to the MinMax class, and heavy
Possible Solution:
Create a simulator class.
The class has access to the logical guti board (guti_map) and can map Address objects to their relevant positions
in a 1D array. The move and reverseMove functions can be rewritten to manipulate the 1D array. By manipulating the
array in place instead of creating a new array every time, the performance of simulating the game should improve significantly.
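A minimal Python sketch of the in-place approach is shown below. The cell encoding (`EMPTY`, `GREEN`, `RED`), the `captured` parameter, and the history stack are assumptions for illustration; the mapping from Address objects to array indexes is left out.

```python
EMPTY, GREEN, RED = 0, 1, 2  # assumed cell encoding

class Simulator:
    def __init__(self, cells):
        self.cells = list(cells)  # one flat array, mutated in place
        self._history = []        # records what reverse_move needs to undo

    def move(self, src, dst, captured=None):
        """Move the piece at src to dst, optionally removing a captured piece."""
        taken = self.cells[captured] if captured is not None else EMPTY
        self._history.append((src, dst, captured, taken))
        self.cells[dst] = self.cells[src]
        self.cells[src] = EMPTY
        if captured is not None:
            self.cells[captured] = EMPTY

    def reverse_move(self):
        """Undo the most recent move, restoring any captured piece."""
        src, dst, captured, taken = self._history.pop()
        self.cells[src] = self.cells[dst]
        self.cells[dst] = EMPTY
        if captured is not None:
            self.cells[captured] = taken
```

A search routine can then call `move`, evaluate the position, and call `reverse_move`, so no board copy is allocated per simulated state.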
#4 Adjust Reward Policy
Reward policy needs fine tuning from the Unity side.
#15 Create an AgentMoveFilterClass
AgentMoveFilterClass handles the perspective of the board as seen by the agent and also converts the generated moves accordingly.
#17 Build a Settings page
Build a Settings page to control environment settings
#10 Implement PPOGutiAgent class, extending the GutiAgent class
Do this only after solving issue #9.
PPOGutiAgent takes the same input but, instead of a state value,
returns a vector indicating the selected action.
This class may also need a CheckMoveValidity function.
#8 Enable 2-ply or 3-ply search to be performed by the GutiAgent, using its Neural Network
Possible Solution: Add an exploration depth variable.
After sending all simulated states to the NN, select the 4 or fewer states with the highest values and expand them,
then repeat the initial process of feeding the resulting states into the NN for every selected state.
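The expand-the-best-few loop above resembles a beam search, and could be sketched in Python as follows. `successors` and `value_fn` are hypothetical callbacks standing in for the simulator and the neural network, and the sketch deliberately ignores opponent replies between plies, which a real implementation would need to handle.

```python
def beam_search(root, successors, value_fn, depth, beam_width=4):
    """Return (first_ply_state, value) for the best line found, or None."""
    # Each entry pairs the first-ply state with the state reached from it.
    frontier = [(s, s) for s in successors(root)]
    best = None
    for ply in range(depth):
        if not frontier:
            break
        # Score every state with the evaluator, highest value first.
        scored = sorted(frontier, key=lambda pair: value_fn(pair[1]), reverse=True)
        best = scored[0]
        if ply == depth - 1:
            break
        # Expand only the top beam_width states for the next ply.
        frontier = [
            (first, nxt)
            for first, current in scored[:beam_width]
            for nxt in successors(current)
        ]
    return None if best is None else (best[0], value_fn(best[1]))
```

Here `depth` plays the role of the exploration depth variable, and `beam_width=4` mirrors the "4 or fewer states" limit.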
#6 Player Class is too cluttered
The Player class has too many if/else conditions and specialized code for each player type.
Possible Solution:
Create a base Player class or interface.
Inherit from it to create separate classes:
RLAPlayer
MinMaxPlayer
HumanPlayer
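A minimal Python sketch of that hierarchy is given below. The `choose_move` method and the constructor arguments (`evaluate`, `policy`, `ask`) are hypothetical; they only show how per-type branching could move out of a single Player class and into subclasses.

```python
from abc import ABC, abstractmethod

class Player(ABC):
    """Common interface; the game only ever calls choose_move."""

    @abstractmethod
    def choose_move(self, legal_moves):
        ...

class RLAPlayer(Player):
    def __init__(self, policy):
        self._policy = policy  # stand-in for a trained agent

    def choose_move(self, legal_moves):
        return self._policy(legal_moves)

class MinMaxPlayer(Player):
    def __init__(self, evaluate):
        self._evaluate = evaluate  # stand-in for the MinMax search value

    def choose_move(self, legal_moves):
        return max(legal_moves, key=self._evaluate)

class HumanPlayer(Player):
    def __init__(self, ask):
        self._ask = ask  # stand-in for UI input

    def choose_move(self, legal_moves):
        return self._ask(legal_moves)
```

With this shape, adding a new player type means adding a subclass rather than another if/else branch.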
#14 Create a filter to enable the same trained network to be used for both the Red and Green players
Possible Solution:
- Since the agents are trained from the Green perspective, when fetching observations from the ObservationGenerator class,
simply reverse the observation list.
- Besides reversing the observation, the move indexes must also be translated from the Green player's perspective to the Red player's perspective every time.
- This will lead to a lot of custom, tightly coupled code.
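The two translations in the list above can be sketched in Python as below. The sketch assumes a flat observation list in which reversing the list corresponds to viewing the board from the opposite side, so that cell `i` maps to cell `n_cells - 1 - i`; the function names are hypothetical.

```python
def flip_observation(observation):
    """Reverse the observation so Red sees the board as Green would."""
    return list(reversed(observation))

def flip_index(index, n_cells):
    """Translate a cell index between the two perspectives."""
    return n_cells - 1 - index

def flip_move(move, n_cells):
    """Translate a (source, destination) move between perspectives."""
    src, dst = move
    return (flip_index(src, n_cells), flip_index(dst, n_cells))
```

Each function is its own inverse, so the same filter can be applied on the way into the network (observations) and on the way out (chosen moves).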