eugenevinitsky / sequential_social_dilemma_games
Repo for reproduction of sequential social dilemmas
License: MIT License
Agent actions are currently stepped through one by one, regardless of how they may interact with other actions.
For example, suppose an agent intends to move right into a cell occupied by another agent that also intends to move right: if the first agent moves before the second, its move is disallowed because the cell is still occupied, even though the occupant is about to vacate it.
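A sketch of one possible fix: collect all intended moves first and resolve them simultaneously, so that an agent may enter a cell whose occupant is moving out this step. All names here (`pos`, `intended_pos`, `resolve_moves`) are assumptions for illustration, not the repo's actual API.

```python
# Hypothetical simultaneous move resolution. A move succeeds if its target
# cell is either empty or being vacated this step, and no other agent wants
# the same cell. This single pass is a sketch; chains of conditional moves
# would need iteration to resolve fully.

def resolve_moves(agents):
    targets = [tuple(a.intended_pos) for a in agents]
    occupied = {tuple(a.pos) for a in agents}
    # Cells whose occupant intends to leave this step.
    vacated = {tuple(a.pos) for a in agents
               if tuple(a.intended_pos) != tuple(a.pos)}
    for agent, target in zip(agents, targets):
        contested = targets.count(target) > 1
        blocked = target in occupied and target not in vacated
        if not contested and not blocked:
            agent.pos = list(target)
```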
The order of events is:
(1) Move
(2) Fire
(3) Spawn
However, we currently check in the apple spawn method whether a firing beam occupies the apple spawn point, and if so, we don't spawn an apple there. This leads to two bugs:
(1) Apples can't spawn in cells temporarily obscured by a firing beam.
(2) We check whether an agent is currently in that spot before spawning an apple; however, an agent COULD be there and just be temporarily obscured by the beam.
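Both bugs stem from consulting a map that has beams drawn onto it. A sketch of a fix, assuming the env keeps an underlying world map (agents, apples, walls) separate from the rendered view that overlays beams; the names `world_map`, `spawn_points`, and the cell symbols are assumptions.

```python
import random

def spawn_apples(world_map, spawn_points, spawn_prob):
    """Spawn apples from the true map state, ignoring transient beams.

    A beam only obscures a cell in the rendered view; it should neither
    block apple spawning nor hide an agent standing in the spawn cell.
    Assumed symbols: ' ' empty, 'A' apple, 'P' agent.
    """
    for (row, col) in spawn_points:
        cell = world_map[row][col]
        # Check the true occupant: skip cells that really hold an agent or
        # an apple; beams never appear in world_map, so they cannot block.
        if cell == " " and random.random() < spawn_prob:
            world_map[row][col] = "A"
```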
gitignore:
pycache
eggs
Can probably use a standard gitignore.
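For example, a standard Python .gitignore covering these entries plus the usual build artifacts:

```gitignore
# byte-compiled files
__pycache__/
*.py[cod]

# packaging / eggs
*.egg-info/
.eggs/
dist/
build/

# misc
.ipynb_checkpoints/
.DS_Store
```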
env.render should use matplotlib to show the results. It currently appears to do nothing.
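A minimal sketch of what render could do with matplotlib, assuming the env can produce an RGB array of the current map (`rgb_map` here is an assumed input, and `render` an assumed signature):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for scripts/CI; drop for interactive use
import matplotlib.pyplot as plt
import numpy as np

def render(rgb_map, filename=None):
    """Display (or save) the current map state as an image."""
    fig, ax = plt.subplots()
    ax.imshow(np.asarray(rgb_map))
    ax.axis("off")
    if filename is not None:
        fig.savefig(filename)
    else:
        plt.show()
    plt.close(fig)
```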
README.md should describe the six methods every subclass of MapEnv needs to implement.
In cleanup the spawn probabilities don't seem to work correctly; the waste spawns too fast.
Map update methods are currently split in a very ad-hoc way between HarvestEnv and MapEnv. As much of the update process as possible should be moved into MapEnv.
Descriptions of harvest and cleanup can be found at:
https://arxiv.org/pdf/1810.08647.pdf
Agents observing other agents need to know which way they are facing.
We need to shade the colors of the agents slightly (or some other scheme) to indicate orientation
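One way to do this (a sketch of an assumed scheme, not the project's actual choice) is to dim the agent's base color by a per-orientation factor, so observers can read facing direction from the shade:

```python
# Dimming factors chosen to be exactly representable in binary floating
# point, so shading is deterministic across platforms.
ORIENTATION_SHADE = {"UP": 1.0, "RIGHT": 0.875, "DOWN": 0.75, "LEFT": 0.5}

def shade_color(rgb, orientation):
    """Return the agent's base RGB color dimmed according to its orientation."""
    factor = ORIENTATION_SHADE[orientation]
    return tuple(int(c * factor) for c in rgb)
```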
The run script is up on the example_script branch in the run_script folder.
The remaining task is to figure out how to set the filters correctly for the ConvNet.
@natashamjaques I've implemented a model from the paper; it is in models/causal_to_fc_net.py. It would be good if you could check it over and see if it looks right to you. It'd be a tragedy if down the line we realized I misunderstood something.
Custom models must be registered to be re-used in RLlib via ModelCatalog.register_custom_model(model_name, model_class). However, when we replay things we no longer have this information. Consider storing this as a tune function in the env config so that it can be recreated later.
...before it's too late.
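The idea can be sketched without pulling in ray: keep a registry and record the registered name in the config, so replay code can recover the model class. `MODEL_REGISTRY`, `register_model`, and `recreate_model` are assumed helpers; in actual RLlib code the registration call would be `ModelCatalog.register_custom_model(name, model_class)`.

```python
# Stand-in for RLlib's model catalog, to show the config round-trip.
MODEL_REGISTRY = {}

def register_model(name, model_class, config):
    """Register the model and record its name in the config for later replay."""
    MODEL_REGISTRY[name] = model_class
    config["custom_model"] = name
    return config

def recreate_model(config):
    """Look the model class back up from a saved config at replay time."""
    return MODEL_REGISTRY[config["custom_model"]]
```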
@natashamjaques do we just overlay a new color that indicates agent + firing beam, does the firing beam temporarily obscure the agent, or does the agent stay with its color unchanged but a point gets subtracted? There's some choice here and I'm curious what y'all did.
Currently, a firing beam will cause the apple to disappear. Is this the correct behavior?
I'm currently picking harvest vs. cleanup by changing it directly in the code in rollout.py; this should be a command-line arg.
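A sketch of exposing that choice via argparse (flag name and defaults are assumptions):

```python
import argparse

def make_parser():
    """Build the rollout argument parser with an --env flag."""
    parser = argparse.ArgumentParser(description="Roll out a trained policy.")
    parser.add_argument("--env", choices=("harvest", "cleanup"),
                        default="harvest",
                        help="Which environment to roll out in.")
    return parser
```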
function SoftHuangpu:postUpdate(gameState)
  local wasteDensity = 0
  if self._config.potentialWasteArea ~= 0 then
    wasteDensity = 1 - self._state.permittedLemonCells:size() /
        self._config.potentialWasteArea
  end
  if wasteDensity >= self._config.thresholdDepletion then
    self._state.lemonSpawnProbability = 0
    self._state.appleRespawnProbability = 0
  else
    self._state.lemonSpawnProbability = self._config.lemonSpawnProbability
    if wasteDensity <= self._config.thresholdRestoration then
      self._state.appleRespawnProbability = self._config.appleRespawnProbability
    else
      -- Interpolate.
      self._state.appleRespawnProbability = (1 - (
          wasteDensity - self._config.thresholdRestoration) / (
          self._config.thresholdDepletion - self._config.thresholdRestoration)) *
          self._config.appleRespawnProbability
    end
  end
  super.postUpdate(self, gameState)
end
The hyperparameters are:
thresholdDepletion = 0.4
thresholdRestoration = 0.0
lemonSpawnProbability = 0.5
appleRespawnProbability = 0.05
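A Python transcription of that spawn-probability schedule with the listed hyperparameters, to make the interpolation easy to check (`spawn_probabilities` is a name chosen here, not the repo's):

```python
THRESHOLD_DEPLETION = 0.4
THRESHOLD_RESTORATION = 0.0
LEMON_SPAWN_PROBABILITY = 0.5
APPLE_RESPAWN_PROBABILITY = 0.05

def spawn_probabilities(waste_density):
    """Return (lemon_spawn_prob, apple_respawn_prob) for a waste density."""
    if waste_density >= THRESHOLD_DEPLETION:
        return 0.0, 0.0
    if waste_density <= THRESHOLD_RESTORATION:
        return LEMON_SPAWN_PROBABILITY, APPLE_RESPAWN_PROBABILITY
    # Linearly interpolate the apple probability between the two thresholds.
    frac = (waste_density - THRESHOLD_RESTORATION) / (
        THRESHOLD_DEPLETION - THRESHOLD_RESTORATION)
    return LEMON_SPAWN_PROBABILITY, (1 - frac) * APPLE_RESPAWN_PROBABILITY
```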
We need to convert the internal map representation that we manipulate into an appropriate color state for the agent.
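A sketch of that conversion, mapping map characters to RGB values; the specific symbols and colors here are assumptions for illustration:

```python
import numpy as np

# Assumed symbols: ' ' empty, 'A' apple, 'P' agent, '@' wall, 'F' firing beam.
CHAR_TO_COLOR = {
    " ": (0, 0, 0),        # empty: black
    "A": (0, 255, 0),      # apple: green
    "P": (0, 0, 255),      # agent: blue
    "@": (128, 128, 128),  # wall: gray
    "F": (255, 255, 0),    # beam: yellow
}

def map_to_colors(char_map):
    """Return an (H, W, 3) uint8 RGB array for the given character map."""
    h, w = len(char_map), len(char_map[0])
    rgb = np.zeros((h, w, 3), dtype=np.uint8)
    for i in range(h):
        for j in range(w):
            rgb[i, j] = CHAR_TO_COLOR[char_map[i][j]]
    return rgb
```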
Construct the following model in the run scripts (https://arxiv.org/pdf/1810.08647.pdf section 6.3):
a single convolutional layer with a kernel of size 3, stride of size 1, and 6 output
channels. This is connected to two fully connected layers of size 32 each, and an LSTM with 128 cells.
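A back-of-the-envelope shape and parameter check for that architecture can help when setting the ConvNet filters. The 15x15x3 input here is an assumption for illustration only:

```python
def conv_out_hw(h, w, kernel=3, stride=1, padding=0):
    """Output height/width of a conv layer (no dilation)."""
    return ((h + 2 * padding - kernel) // stride + 1,
            (w + 2 * padding - kernel) // stride + 1)

def param_counts(h=15, w=15, in_ch=3):
    """Parameter counts for conv(3x3, stride 1, 6 ch) -> FC 32 -> FC 32 -> LSTM 128."""
    conv = in_ch * 6 * 3 * 3 + 6                 # weights + biases
    ch, cw = conv_out_hw(h, w)
    flat = ch * cw * 6                           # flattened conv output
    fc1 = flat * 32 + 32
    fc2 = 32 * 32 + 32
    lstm = 4 * (128 * (32 + 128) + 128)          # 4 gates: input + recurrent + bias
    return {"conv": conv, "fc1": fc1, "fc2": fc2, "lstm": lstm}
```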
Currently the orientation of the axes is flipped, i.e. the diagram of moving in the ascii array looks like this:

          ^
          |
         DOWN
  RIGHT --*-- LEFT
          UP
          |
          v

This is convenient for coding but will need to be flipped when actually displaying.
You can run each of the run_scripts for 50 steps just to check that they are not broken by any changes.