Comments (4)
Hi @MoMe36, thank you for your positive feedback! Regarding your questions:
- Can you explain what you mean by visual inspection here? The current implementation checkpoints the model a few times as `.pth` files, and the file name is the number of iterations. If you mean generating a video animation by executing a checkpoint file for one episode, this is done in the `baselines/plot.ipynb` file: the second cell contains a `make_video` function.
To make it clearer how the files are structured when one uses `run_experiment`:
Suppose one defines the experiment name as 'default' in the `run_experiment` function, and there is no configuration sweeping, i.e. a single configuration (no `Grid`/`Random` object in the config). Then there is a single unique job ID, number 0, and it has 3 random runs with different random seeds, say [123, 456, 789].
Then the file structure under `logs` looks like:
- logs
  - default  # experiment name
    - 0  # job ID
      - 123  # random seed
      - 456
      - 789
Under each seed leaf folder, all loggings and checkpoints are stored. To generate a video animation by executing one checkpoint file, one can simply call the `make_video` function in the `plot.ipynb` file with the corresponding arguments; all the file loading, action selection, and mp4 generation are done for you internally.
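To illustrate the layout above, here is a small stdlib-only sketch that walks `logs/<experiment>/<job_id>/` and collects the `.pth` checkpoint files per seed folder (`list_checkpoints` is a hypothetical helper for this comment, not part of lagom):

```python
from pathlib import Path

def list_checkpoints(log_root, experiment, job_id):
    # Walk logs/<experiment>/<job_id>/ and map each seed folder
    # to the sorted names of its .pth checkpoint files.
    job_dir = Path(log_root) / experiment / str(job_id)
    result = {}
    for seed_dir in sorted(p for p in job_dir.iterdir() if p.is_dir()):
        result[seed_dir.name] = sorted(f.name for f in seed_dir.glob("*.pth"))
    return result
```

One could use this to quickly check which checkpoints each seed produced before feeding a path to `make_video`.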
- For the current status, the checkpointing mechanism is implemented in the `experiment.py` file for each algorithm; you can set `checkpoint.num` in the config object to define how many checkpoint files to generate during the entire training.
- However, resume-training functionality is not supported yet. Although this should not be too painful to add, the philosophy of this repo is to provide research-friendly code that is minimalist and easy to modify, so I am still hesitating about the best way to implement it with minimal change, good user experience, and low coding complexity. But for sure, this would be a very nice functionality to have.
Hope it helps, and don't hesitate to discuss together if you have further questions.
from lagom.
Hi! Thanks for the quick and detailed answer! I did figure out how to use the `make_video` method, which is exactly what I was looking for. However, I'm not able to find the checkpoints and logging.
I do have the folder structure you're describing, but when I check within a seed folder, I find only `agent_1.pth` (along with `obs_moment.pth` in the case of PPO). And the agent really doesn't perform as it should, given its episode returns. I checked the config file in which I specified the log freq to be 10, but I don't see any other checkpoint besides the one I mentioned, even after 200 iterations.
Do you have any idea what I'm doing wrong? Thanks a lot (:
Alright, I get it! Checkpoints are saved only at the end of training, it seems. Thanks anyway!
Hi @MoMe36, for PPO, it's better to use the `logs/default` folder, because the other logging folders are temporary (i.e. they might not have all checkpoints).
- `log_freq`: it only controls how frequently loggings are dumped to the screen.
- The checkpointing is controlled by `checkpoint.num`, e.g. for `checkpoint.num=3`, it checkpoints first before the training, second in the middle of the entire training, and third at the end of the training. If you want more checkpoints, you can simply increase this integer in the config object.
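The schedule described, first checkpoint before training, last at the end, and the rest evenly spaced, can be sketched as follows (an illustrative helper, not lagom's actual code):

```python
def checkpoint_iterations(num_checkpoints, total_iterations):
    # Evenly spaced checkpoint points over the whole training run:
    # for num_checkpoints=3 this yields the start, the middle, and the end.
    if num_checkpoints <= 1:
        return [total_iterations]
    step = total_iterations / (num_checkpoints - 1)
    return [round(i * step) for i in range(num_checkpoints)]
```

For example, with `checkpoint.num=3` and 200 training iterations this would place checkpoints at iterations 0, 100, and 200.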
Hope it helps!