
simionsoft / simionzoo


A workbench for online model-free Reinforcement Learning on continuous control problems

License: Other

C++ 91.17% C 0.13% C# 7.12% F* 0.08% Shell 1.33% Roff 0.01% HTML 0.17%
reinforcement-learning cntk continuous-control distributed-systems windows linux

simionzoo's People

Contributors

alegd, aserrin55, azure-pipelines[bot], borjafdezgauna, borjafdezgauna2, ladm2110, rafahperez, utercero, zimmerrol


simionzoo's Issues

Output Log directory

The output directory where logs are saved should be inside the directory that contains the experiment file.
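As a minimal sketch of this idea, the log directory can be derived from the experiment file's own path. The function name `logDirectoryFor` and the sub-folder name `logs` are hypothetical and not taken from the project; the sketch assumes C++17 `std::filesystem` is available.

```cpp
#include <filesystem>

namespace fs = std::filesystem;

// Returns a log directory located inside the folder that holds the experiment file.
// ASSUMPTION: the sub-folder name "logs" is illustrative only.
fs::path logDirectoryFor(const fs::path& experimentFile)
{
    return experimentFile.parent_path() / "logs";
}
```

A caller would typically pass the result to `std::filesystem::create_directories` before opening any log file.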

Distributed execution

HerdClient / HerdAgent:

  1. Currently, the client sends a UDP broadcast and the herd agents connect via TCP. It might be better if each herd agent answered via UDP (reporting how many cores it has), and the client (AppXML) then connected via TCP only if needed.
  2. So far, each herd agent runs only one job. The number of jobs should depend on the number of cores.
  3. Check that the output files have been correctly generated before sending them back.
  4. Pipes should be renamed to connect the executables run by the herd agents and the main app (AppXML).
  5. Once it works properly, the test-agent should run as a service.
  6. Provide a service installer that sets the firewall rules (bat or exe).
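For item 2 above, one possible way to tie the number of jobs to the core count is sketched below. Both function names are hypothetical, and the one-job-per-core policy is an assumption rather than the project's decision.

```cpp
#include <thread>

// Number of jobs a herd agent could accept at once, given its reported core count.
// A reported value of 0 is treated as "unknown" and falls back to a single job,
// because std::thread::hardware_concurrency() is allowed to return 0.
unsigned int maxConcurrentJobs(unsigned int reportedCores)
{
    return reportedCores == 0 ? 1u : reportedCores;
}

// Core count as seen by the local machine; this is what an agent might report.
unsigned int localCoreCount()
{
    return std::thread::hardware_concurrency();
}
```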

Use the DLL to check input/output file conflicts in AppXML

The function to be called:

int getIOFiles(const char* xmlFilename,char* pBuffer,int bufferSize)

xmlFilename: path to the config XML file
pBuffer: memory buffer allocated by the caller (AppXML). 10 KB should be more than enough
bufferSize: size, in bytes, allocated for the buffer
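The caller-allocated buffer pattern described above might look roughly like this. Only the signature of `getIOFiles` comes from the issue; the stub body, the return-code convention (0 on success, negative on failure) and the wrapper `queryIOFiles` are assumptions made so that the sketch compiles on its own.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Stand-in for the real DLL export, present only so this sketch is self-contained.
// ASSUMPTION: returns 0 on success and a negative value when the buffer is too small.
int getIOFiles(const char* xmlFilename, char* pBuffer, int bufferSize)
{
    // Hypothetical payload: echoes the config file name back as a single entry.
    int needed = std::snprintf(pBuffer, static_cast<std::size_t>(bufferSize), "%s", xmlFilename);
    return (needed >= 0 && needed < bufferSize) ? 0 : -1;
}

// Caller side: allocate the buffer, pass its size, and read the result back.
// Returns an empty string if the call reports a failure.
std::string queryIOFiles(const std::string& xmlFilename, int bufferSize = 10 * 1024)
{
    std::vector<char> buffer(static_cast<std::size_t>(bufferSize), '\0');
    if (getIOFiles(xmlFilename.c_str(), buffer.data(), bufferSize) != 0)
        return std::string();
    return std::string(buffer.data());
}
```

Keeping allocation on the caller's side avoids freeing memory across the DLL boundary, which is one common reason for this style of interface.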

Test/correct INAC

Reproduce the experiments in "Model-Free Reinforcement Learning with Continuous Action in Practice" to test the INAC implementation

Save Batch / load batch and output directory structure

Add a button to save and load the whole experiment tree, next to Run, e.g. Save Batch / Load Batch. I have also realized that it makes no sense to keep the logs in one directory and the experiment parameters in another.

Could you please generate the experiments following this structure?

The experiment tree in:
../experiments//experiment.xml

This XML would contain the node hierarchy and the path to each experiment:

// whatever the root node's name is // only the leaves have a Path and an output directory

And inside the folder:

../experiments/root/experiment.xml (the batch/experiment tree)
/child3/child3.xml
/child4/child4.xml
/child2/child2.xml

I would then generate the logs inside these folders (I will take the folder from the path to the experiment's XML that you pass me)

Multi-file-path values

They all get the last value:

        <Training-Wind-Data>../config/world/wind-turbine/TurbSim-12.hh</Training-Wind-Data>
        <Training-Wind-Data>../config/world/wind-turbine/TurbSim-12.hh</Training-Wind-Data>
        <Training-Wind-Data>../config/world/wind-turbine/TurbSim-12.hh</Training-Wind-Data>
        <Training-Wind-Data>../config/world/wind-turbine/TurbSim-12.hh</Training-Wind-Data>
        <Training-Wind-Data>../config/world/wind-turbine/TurbSim-12.hh</Training-Wind-Data>
        <Training-Wind-Data>../config/world/wind-turbine/TurbSim-12.hh</Training-Wind-Data>
        <Training-Wind-Data>../config/world/wind-turbine/TurbSim-12.hh</Training-Wind-Data>

example: experiments/examples/wt-vidal-80.node

Set as Null

Invert the logic: Use (marked true by default)

Mark it false if the node doesn't exist in the xml file when loading

Different extensions for experiments and batches

Experiments and batches (experiment tree) should have a different extension to avoid loading an experiment as an experiment tree and the other way around.

-.exp.xml for experiment files
-.bat.xml for experiment trees

??
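A small sketch of how a loader could tell the two kinds of file apart by suffix. The suffixes are the ones proposed above; the enum and function names are hypothetical.

```cpp
#include <string>

enum class ConfigKind { Experiment, Batch, Unknown };

// True if 's' ends with 'suffix'.
static bool endsWith(const std::string& s, const std::string& suffix)
{
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Classifies a file name by the proposed double extension.
ConfigKind classifyConfigFile(const std::string& filename)
{
    if (endsWith(filename, ".exp.xml")) return ConfigKind::Experiment;
    if (endsWith(filename, ".bat.xml")) return ConfigKind::Batch;
    return ConfigKind::Unknown;
}
```

With such a check, a load dialog could refuse to open a batch as a single experiment, and the other way around.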

Pipe names

Name pipes consecutively to avoid duplicated names. For example: pipe1, pipe2, ...
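One way to produce such names is a counter shared by everything that opens pipes. The class below is a hypothetical sketch, not the project's code.

```cpp
#include <atomic>
#include <string>

// Hands out pipe1, pipe2, ... Each call returns a name not returned before
// by the same generator, even when called from several threads.
class PipeNameGenerator
{
public:
    std::string next()
    {
        // Pre-increment on std::atomic returns the incremented value.
        return "pipe" + std::to_string(++m_counter);
    }

private:
    std::atomic<unsigned int> m_counter{0};
};
```

Note that this only guarantees unique names within one process; if several launchers run on the same machine, the names would need an extra component to stay distinct.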

Re-structure episode definitions

The specifics of the episodes (set-point files, number of episodes) should be abstracted out from the user and defined in the world definition xml. Then in the configuration, the user needs only select which experiment type to run, i.e. "random wind data file" for training, and "....10.25.hh" for evaluation

This data should be taken out of the otherwise "stationary" dynamic-model class

Tree/node names

Apparently, node names can't start with a number (when a tree is saved). Could this be checked in AppXML?

Can the root name in a tree be replaced with the app's name? My bad.
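The restriction on leading digits matches the rule for XML element names, which cannot start with a digit. A simplified check that AppXML could apply before saving might look like the sketch below. It only accepts ASCII letters, digits, '_', '-' and '.', which is stricter than the full XML name rules, and the function name is hypothetical.

```cpp
#include <cctype>
#include <string>

// Simplified, ASCII-only approximation of the XML element-name rules:
// the first character must be a letter or '_'; the rest may also be
// digits, '-' or '.'.
bool isValidNodeName(const std::string& name)
{
    if (name.empty()) return false;

    const unsigned char first = static_cast<unsigned char>(name[0]);
    if (!(std::isalpha(first) || first == '_')) return false;

    for (unsigned char c : name)
    {
        if (!(std::isalnum(c) || c == '_' || c == '-' || c == '.')) return false;
    }
    return true;
}
```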

Copy/paste configuration

Setting up an experiment would be easier if we could copy/paste part of an experiment configuration. For example, when using Actor-Critic, one could right-click on the RBF-Gaussian parameterization of the actor and copy it (we could copy the XML of the branch to the clipboard), then right-click and paste it on the same branch of the critic.

Kill all

Close all processes/pipes launched/opened when the process manager is forced to close

C++ runtime DLL

Find a way to determine which one is being used or link it statically if it works

Define experiment templates

Instead of providing all the parameters every time we want to run an experiment, define classes that inherit from CExperiment and simplify the configuration. For example:

-"Learn": all the parameters must be defined. Training and evaluation episodes
-"Evaluate": just a single evaluation episode

Reconfigure the base experiment files

For the three main worlds: wind-turbine, underwater-vehicle and pitch-control

First, ControllerToVFAPolicy to approximate the base controller (export-uv-5000.xml)
Then RLSimion to improve the policy

Modify the behavior after clicking run button

If the tree has not been modified, just launch the experiments.
If the tree has been modified (a node has been added, a node has been removed, or a leaf has been modified), ask for the .tree name.
If the tree has not been loaded, ask for the .tree name.
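The three cases above reduce to a small decision function. The enum and function names below are hypothetical and only illustrate the rule.

```cpp
// State of the experiment tree at the moment Run is clicked.
enum class TreeState { LoadedUnmodified, LoadedModified, NotLoaded };

// What the GUI should do in response.
enum class RunAction { LaunchDirectly, AskForTreeNameThenLaunch };

// Only an unmodified tree that was loaded from a file can be launched
// without asking the user for a .tree name first.
RunAction onRunClicked(TreeState state)
{
    return state == TreeState::LoadedUnmodified
               ? RunAction::LaunchDirectly
               : RunAction::AskForTreeNameThenLaunch;
}
```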

XML configuration and validation

Instead of .txt files, XML files should be used to configure experiments and batches of experiments. All the applications should read XML files and validate them, basically replacing the CParameters class with some other class that, if possible, keeps the same interface to minimize changes in the code.
First, a tool is needed to automatically convert the existing configuration files to XML following the XML schemas.

Messages in the Process Manager

In the process manager, since I send the progress every so often, the rest of the messages are overwritten so quickly that there is no time to read them. A combobox or textbox could be placed on the left, like the one used for the status, and a multiline one on the right so that one can scroll through the log messages that are not of type

Remove static member variables

They work fine when a process is created and ended, but not when several apps are created/destroyed in the same program: the destructors are not called until the program ends

Save the .tree file

When a tree is run, the .tree file is not saved although the program asks for its name

Tile-coding

Implement a tile-coding feature map (required for the experiments in "Model-Free Reinforcement Learning with Continuous Action in Practice")
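As a rough illustration of the technique (not the project's feature-map interface), the sketch below implements uniform-offset tile coding: each tiling is a grid shifted by a fraction of a tile width, and exactly one tile per tiling is active for a given input. The struct name, its members and the choice of evenly spaced offsets are all assumptions.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

struct TileCoder
{
    std::vector<double> minValues;  // lower bound of each input dimension
    std::vector<double> maxValues;  // upper bound of each input dimension
    std::size_t tilesPerDim;        // tiles covering [min, max] in each dimension
    std::size_t numTilings;         // number of shifted grids

    // Each tiling has one extra tile per dimension to cover the shifted range.
    std::size_t tilesPerTiling() const
    {
        std::size_t n = 1;
        for (std::size_t d = 0; d < minValues.size(); ++d) n *= (tilesPerDim + 1);
        return n;
    }

    std::size_t numFeatures() const { return numTilings * tilesPerTiling(); }

    // Returns the index of the single active binary feature in each tiling.
    std::vector<std::size_t> activeFeatures(const std::vector<double>& x) const
    {
        std::vector<std::size_t> active;
        for (std::size_t t = 0; t < numTilings; ++t)
        {
            // Offset of this tiling, as a fraction of one tile width.
            const double offset = static_cast<double>(t) / static_cast<double>(numTilings);
            std::size_t index = 0;
            for (std::size_t d = 0; d < x.size(); ++d)
            {
                double scaled = (x[d] - minValues[d]) / (maxValues[d] - minValues[d])
                                * static_cast<double>(tilesPerDim);
                scaled = std::min(std::max(scaled, 0.0), static_cast<double>(tilesPerDim));
                const std::size_t tile = static_cast<std::size_t>(std::floor(scaled + offset));
                index = index * (tilesPerDim + 1) + tile;
            }
            active.push_back(t * tilesPerTiling() + index);
        }
        return active;
    }
};
```

A linear value function would then be the sum of the weights at the returned indices, so the cost per evaluation grows with the number of tilings rather than with the total number of features.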

Restructure RLSimion / Badger

Both applications should be merged using a GUI. This application could run one experiment (RLSimion) or a batch of experiments (Badger). Parameters could ideally be branched using foreach and groups.

Feature maps

Should generic feature maps be made explicitly StateFeatureMaps or ActionFeatureMaps?
Currently, only derived classes make this explicit

Resume

Remove all the resume stuff and simplify the interface: just a button (with the name of the branch??)
