s-sd / spurl
Self-Play Using Reinforcement Learning (SPURL)
License: MIT License
The bank of networks is saved in a temp directory, which may be cleared after training.
Add a descriptive README that outlines the functionality of the repository.
Multi-discrete actions can be handled with different network branches or with different networks.
For REINFORCE with discrete actions, add a mode for deterministic predictions.
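A deterministic mode for discrete actions usually means taking the most probable action instead of sampling from the policy's distribution. The following is a minimal sketch of that idea in plain Python; the `deterministic` flag, the `rng` parameter, and the helper `softmax` are assumptions made for illustration and are not taken from spurl's code.

```python
import math
import random


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_action(logits, deterministic=False, rng=random):
    """Pick a discrete action from policy logits.

    deterministic=True returns the argmax (hypothetical flag name);
    otherwise the action is sampled from the softmax distribution.
    """
    probs = softmax(logits)
    if deterministic:
        return max(range(len(probs)), key=probs.__getitem__)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

Training would keep `deterministic=False` so the policy still explores; the flag is only meant for evaluation or play.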
This can be done entirely inside the update function, with the help of an additional value function that learns to approximate rewards. First update the value function using a simple loss, then update the model.
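The note does not say what the value function is used for once it is trained. One common reading, assumed here, is that its estimates are subtracted from the returns so the model update is weighted by an advantage. The sketch below follows the stated order (value function first, then the quantities for the model update) with a tabular value function; the function name `update`, the dictionary representation, and the learning rate are all hypothetical.

```python
def update(values, episode, lr_value=0.1, gamma=0.99):
    """One update step over a finished episode.

    values:  dict mapping state -> current value estimate (tabular, assumed)
    episode: list of (state, reward) pairs in time order
    Returns the discounted returns and the advantages that a policy
    update could use as weights (an assumed use, not spurl's API).
    """
    # Discounted return from each time step to the end of the episode.
    returns = []
    g = 0.0
    for _, reward in reversed(episode):
        g = reward + gamma * g
        returns.append(g)
    returns.reverse()

    # Step 1: update the value function with a simple squared-error loss.
    # For a tabular estimate, one gradient step on 0.5 * (g - v)^2 is:
    for (state, _), g in zip(episode, returns):
        v = values.get(state, 0.0)
        values[state] = v + lr_value * (g - v)

    # Step 2: compute what the model update needs, using the new estimates.
    advantages = [g - values[state] for (state, _), g in zip(episode, returns)]
    return returns, advantages
```

Because both steps live in one function, nothing outside the update needs to know the value function exists.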
Create a new discrete demo and test out the build_policy_network functions.
Create entry points in core, and utility functions, for interacting with spurl.
Add continuous actions (Gaussian): action sampling in the select_action function needs to draw from a Gaussian (similar to compute loss).
Add an inference mode that only samples deterministically, i.e., std = 0 at inference.
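Setting std = 0 at inference is the same as returning the Gaussian mean. A minimal sketch of that behaviour follows, assuming the policy outputs a mean and a standard deviation per action dimension; the `inference` flag and the `rng` parameter are illustrative names, and the signature is not spurl's actual `select_action`.

```python
import random


def select_action(mean, std, inference=False, rng=random):
    """Sample a continuous action from independent Gaussians.

    mean, std: sequences with one entry per action dimension (assumed layout)
    inference: when True, return the mean, i.e. behave as if std were 0
    """
    if inference:
        return list(mean)
    return [rng.gauss(m, s) for m, s in zip(mean, std)]
```

During training the sampled action should be the one passed to the loss, so that the log-probability is evaluated for the action that was actually taken.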
Options include: convolutional or fully connected.
It should take: the input shape, the output shape, and the action type (continuous, discrete, or multi-discrete).
It should output a constructed network.
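The requirements above can be sketched without committing to a deep learning framework by returning a layer plan rather than a live network. Everything here is hypothetical: the name `plan_policy_network`, the `LayerSpec` record, the layer widths, and the choice of one softmax head per multi-discrete dimension and a mean plus log-std head for continuous actions. A real implementation would map each entry to the framework's own layer classes.

```python
from dataclasses import dataclass
from math import prod


@dataclass
class LayerSpec:
    kind: str        # "conv", "flatten" or "dense"
    units: int       # filters for conv, width for dense, 0 for flatten
    activation: str  # "relu", "softmax" or "linear"


def plan_policy_network(input_shape, output_shape, action_type,
                        body="fully_connected"):
    """Return (shared_layers, heads) describing a policy network."""
    if body not in ("conv", "fully_connected"):
        raise ValueError(f"unknown body: {body}")
    if action_type not in ("continuous", "discrete", "multi_discrete"):
        raise ValueError(f"unknown action type: {action_type}")

    layers = []
    if body == "conv":
        if len(input_shape) != 3:
            raise ValueError("conv body expects (height, width, channels)")
        layers += [LayerSpec("conv", 16, "relu"), LayerSpec("conv", 32, "relu")]
    layers.append(LayerSpec("flatten", 0, "linear"))
    layers.append(LayerSpec("dense", 64, "relu"))

    if action_type == "discrete":
        heads = [LayerSpec("dense", prod(output_shape), "softmax")]
    elif action_type == "multi_discrete":
        # One branch per action dimension, each with its own softmax.
        heads = [LayerSpec("dense", n, "softmax") for n in output_shape]
    else:
        # Continuous: one head for the mean, one for the log std.
        n = prod(output_shape)
        heads = [LayerSpec("dense", n, "linear"), LayerSpec("dense", n, "linear")]
    return layers, heads
```

For example, `plan_policy_network((84, 84, 4), (3, 2), "multi_discrete", body="conv")` yields two convolutional layers, a flatten, a dense layer, and two softmax heads of widths 3 and 2.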
Add a working sequential 1v1 demo with tic-tac-toe (noughts and crosses).
Two-player version of Pong in Retro Gym.
A minimal working example with a dummy environment
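As a rough idea of what a dummy environment for such an example might look like, the sketch below defines a tiny environment with `reset` and `step` methods and a loop that runs one episode. The class name `DummyEnv`, its reward rule, and the `run_episode` helper are invented for illustration and do not reflect spurl's interfaces.

```python
class DummyEnv:
    """Toy environment: the state is the time step, and the agent is
    rewarded when its action matches the parity of that time step."""

    def __init__(self, horizon=5):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        reward = 1.0 if action == self.t % 2 else 0.0
        self.t += 1
        done = self.t >= self.horizon
        return self.t, reward, done


def run_episode(env, policy):
    """Run one episode with policy(state) -> action; return total reward."""
    state = env.reset()
    total = 0.0
    done = False
    while not done:
        state, reward, done = env.step(policy(state))
        total += reward
    return total
```

A policy that returns `state % 2` collects the maximum reward, which gives a simple target for checking that a training loop is learning anything at all.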