
Comments (5)

michaelschaarschmidt commented on June 23, 2024

Hi,

So, with regard to algorithms, we prioritise what we see as having the potential to become a new 'standard' method. That means we won't add every new paper, but rather well-established methods or new approaches that seem like natural progress (e.g. parameter space noise/noisy nets would be a sensible addition even without waiting a year to see whether they get replaced). DDPG could certainly be added; another example would be a hybrid policy gradient/DQN method like PGQ.

The immediate next tasks are a bit more mundane (see the other issues): fixing some small config issues, improving logging and import/export, introducing a generic natural actor-critic as a basis for TRPO variants, Docker, and benchmarking. So in general, we will prioritise making the existing code cleaner, more robust and more reliable before adding more features. In particular, we feel that adding more algorithms becomes much easier if we really focus on getting the modularisation right. Hope that answers your query, and of course we would assist you in incorporating DDPG if you urgently need it.

from tensorforce.

ViktorM commented on June 23, 2024

Thanks for such a detailed reply, Michael!

Yes, I think DDPG can already be called a 'standard' and well-established method, starting from the benchmark paper. The first paper on dexterous manipulation suggests just a small extension of it, plus making it asynchronous. I can start working on an implementation if you accept contributions and can offer a bit of support in following your test and code standards.

And if you can add parameter noise variants of TRPO and other algorithms, that would be great news too!

One more question: do you have any plans for a PPO implementation as well? It's very similar to TRPO, and a distributed version of it was used in the recent DeepMind parkour locomotion paper: https://arxiv.org/abs/1707.02286


michaelschaarschmidt commented on June 23, 2024

I agree that DDPG is established enough to warrant addition. We just have not gotten to it yet because there are so many things to do on the general structure (I'd argue that refactoring the optimisation package so that TRPO fits in more naturally should be very high on that list). So if you want to contribute DDPG, that is very welcome, as long as it integrates into the modularisation and coding style.

Device execution semantics are a whole separate issue. From our point of view, there are many approaches to asynchronous, thread-parallel, and distributed process execution, and there is not much systematic analysis of how to choose amongst them for a given problem. It's often more a case of 'obviously collecting more data works better, and here is how many actors worked best for our problem'. What I personally would want are execution wrappers around the model that implement different approaches to data collection and device execution (Gorila, A3C, GA3C, PAAC, ...) so we can compare and analyse them more systematically, but doing this well is really difficult and will take a lot of time. It's something I am personally very interested in, but as we are doing this on the side of our PhDs it's hard to give timelines.
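To make the execution-wrapper idea a little more concrete, here is a minimal, hypothetical sketch. It is not TensorForce code: `CountingEnv`, `ConstantModel`, `SerialExecution` and `ThreadedExecution` are made-up names, the environment is a toy, and real approaches such as A3C or Gorila also have to deal with gradient sharing and device placement, which this ignores. It only shows how two data-collection strategies could sit behind the same `collect()` interface so they can be swapped and compared.

```python
import threading
from typing import Callable, List, Tuple

Transition = Tuple[int, int, float]  # (state, action, reward)


class CountingEnv:
    """Toy environment (assumption): state is a step counter, reward equals the action."""

    def __init__(self, horizon: int = 5) -> None:
        self.horizon = horizon
        self.state = 0

    def reset(self) -> int:
        self.state = 0
        return self.state

    def step(self, action: int) -> Tuple[int, float, bool]:
        self.state += 1
        return self.state, float(action), self.state >= self.horizon


class ConstantModel:
    """Stand-in for an agent model (assumption): always picks action 1."""

    def act(self, state: int) -> int:
        return 1


def rollout(model: ConstantModel, env: CountingEnv) -> List[Transition]:
    """Run one episode and return its transitions in order."""
    transitions: List[Transition] = []
    state, done = env.reset(), False
    while not done:
        action = model.act(state)
        next_state, reward, done = env.step(action)
        transitions.append((state, action, reward))
        state = next_state
    return transitions


class SerialExecution:
    """Wrapper that collects one episode per environment, one after another."""

    def __init__(self, model: ConstantModel,
                 env_factory: Callable[[], CountingEnv], num_envs: int) -> None:
        self.model = model
        self.envs = [env_factory() for _ in range(num_envs)]

    def collect(self) -> List[Transition]:
        data: List[Transition] = []
        for env in self.envs:
            data.extend(rollout(self.model, env))
        return data


class ThreadedExecution(SerialExecution):
    """Same interface, but runs one thread per environment."""

    def collect(self) -> List[Transition]:
        results: List[List[Transition]] = [[] for _ in self.envs]

        def worker(index: int) -> None:
            # Each thread writes only to its own slot, so no lock is needed.
            results[index] = rollout(self.model, self.envs[index])

        threads = [threading.Thread(target=worker, args=(i,))
                   for i in range(len(self.envs))]
        for thread in threads:
            thread.start()
        for thread in threads:
            thread.join()
        return [t for episode in results for t in episode]
```

Because both wrappers expose the same `collect()` method, a training loop could be written once and benchmarked against either, which is the kind of systematic comparison described above.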


AlexKuhnle commented on June 23, 2024

Just wanted to add that Gitter is probably best for support regarding contributions like DDPG, and yes, we're happy to be of help and give guidance.


michaelschaarschmidt commented on June 23, 2024

Implemented PPO (which you mentioned); I think this can generally be closed.

