Hi, thanks for the great work!
You mentioned in your paper that "As a reference, unconstrained Atari agents are usually trained for 50 million steps" and "Each run utilized around 12GB of VRAM and took approximately 2.9 days on a single Nvidia RTX 4090".
I noticed that in your code, the outer loop calls self.train_agent() 1000 times, and inside it the agent is trained via self.train_component(name, steps) for 5000 steps per call.
That amounts to 1000 * 5000 = 5,000,000 agent training steps. The inner-loop progress bar reads 496/5000 [12:59<1:09:31, 1.08it/s], which works out to roughly 1.3 hours per outer iteration.
So I estimate the total at well over 1000 hours, and that is just for training the agent model.
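For reference, here is the back-of-the-envelope calculation I'm doing (a rough sketch; the loop counts are just how I read the code, and I'm assuming the tqdm rate stays roughly constant for the whole run):

```python
# Rough estimate of agent-training wall-clock time, based on the tqdm line above.
# Assumes every one of the 1000 outer iterations runs the full 5000 inner steps
# at a constant ~1.08 it/s (all numbers come from my reading of the code / progress bar).
outer_iters = 1000        # calls to self.train_agent() in the outer loop
inner_steps = 5000        # steps per self.train_component(name, steps) call
steps_per_sec = 1.08      # rate reported by tqdm

total_steps = outer_iters * inner_steps               # 5,000,000 steps
total_hours = total_steps / steps_per_sec / 3600      # ~1286 hours
print(f"{total_steps:,} steps -> ~{total_hours:.0f} h (~{total_hours / 24:.0f} days)")
```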
Did I overlook something? And what should I do to reproduce the reported training time? @eloialonso @AdamJelley