
Comments (15)

hill-a commented on July 17, 2024

Hi, thank you for the issue and sorry for the delayed answer.

The issue here is that the .learn function keeps track of the step value internally. For this use case, though, it might be good to add a keyword argument to explicitly reset the step value for TensorBoard when calling .learn.

Will make a fix, and update the documentation in the next few days.

from stable-baselines.

araffin commented on July 17, 2024

Hello,
I think what @hill-a meant was to have a keyword argument that would allow resetting the global counter, i.e., something like:
model.learn(1000, reset_step_counter=True)
And yes, you will need to define a global_step_counter variable.
Feel free to submit a PR and to ask if you need help ;) (hill-a was supposed to work on that but he seems quite busy right now)
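A minimal sketch of the counter semantics this keyword would give. This is illustrative only, not the actual stable-baselines API: `Model`, `reset_step_counter`, and the loop body are stand-ins for the real `.learn()` internals.

```python
# Hypothetical sketch (not the real stable-baselines API): .learn()
# keeps a step counter across calls, and reset_step_counter=True
# zeroes it so the next TensorBoard run starts from step 0 again.
class Model:
    def __init__(self):
        self.num_timesteps = 0  # persists across .learn() calls

    def learn(self, total_timesteps, reset_step_counter=False):
        if reset_step_counter:
            self.num_timesteps = 0
        for _ in range(total_timesteps):
            self.num_timesteps += 1  # logged as the TensorBoard x-axis

model = Model()
model.learn(1000)                           # steps 1..1000
model.learn(1000)                           # continues at 1001..2000
model.learn(1000, reset_step_counter=True)  # starts over at 1..1000
```

With the default `reset_step_counter=False`, repeated `.learn()` calls extend the same curve; passing `True` restarts the x-axis.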


pathway commented on July 17, 2024

This will be super helpful!
If anyone has a pointer or hint on how to explicitly reset the step value for TensorBoard, I might do it myself. This is a constant issue for me; it's like flying blind.


bertram1isu commented on July 17, 2024

If no one else is working on this, I might take a crack at it. I have a similar issue and was about to hack it in my local version... seems others could use the fix as well. Any objections?


bertram1isu commented on July 17, 2024

The other thing I'm after is a consistent way to do checkpointing. If I fix this by making the local variables within train part of the class instance variables, so that they retain state across training calls, I'm also going to look for any other necessary variables, so that when the save/load functions are called, I can pick up training where I left off, in the spirit of TensorFlow checkpointing.

Would you prefer these be addressed in two separate issues?

Also, whatever changes I made were aimed primarily at DDPG. The way the train functions are set up, they seem to be per-algorithm, which would mean making the same change in each algorithm. That seems like an indicator that there's probably a smarter way. Has anyone given this some thought and has a better suggestion on where to make these changes?


araffin commented on July 17, 2024

@bertram1isu yes you can work on that ;)

For the other question, it is not entirely clear to me what you want to do; please open a separate issue.


jrjbertram commented on July 17, 2024

I ended up finding a workaround to this problem that partially solves it.

The OpenAI Baselines code contains support for TensorBoard logging within its logger module, and the stable-baselines code still retains this same logging code. You can activate it via:

    from stable_baselines import logger
    print('Configuring stable-baselines logger')
    logger.configure()

To control the location where the logs are stored, set the OPENAI_LOGDIR environment variable to a location on your file system. To control the formats of the logged data (and to enable TensorBoard logging), set the OPENAI_LOG_FORMAT environment variable to "stdout,tensorboard".

This form of TensorBoard logging works fine across multiple calls to training and yields the same statistics as OpenAI Baselines. (Useful for comparing performance across the two forks.)

Here's a comparison of an algorithm running on an environment, but with different numbers of timesteps per learning call (1e5, 1e6, 1e9).

[screenshot]

And a second screenshot of a different part of the tensorboard display:
[screenshot]

These displays show consistent results across multiple calls to train the agent against the environment. (This is evident from the sawtooth-shaped curves in the episodes plot.)

More complete snippet that I'm using right now:

import os

basedir = '/some/directory'

# Create the log directory if it does not already exist
try:
    os.makedirs(basedir)
    print("Directory", basedir, "created")
except FileExistsError:
    pass

os.environ['OPENAI_LOGDIR'] = basedir
os.environ['OPENAI_LOG_FORMAT'] = 'stdout,tensorboard'

from stable_baselines import logger
print('Configuring stable-baselines logger')
logger.configure()

Full code for reference:
https://github.com/jrjbertram/jsbsim_rl/blob/d65d63fe5e3b4e8ac9be580744b0242ab86eafee/compare.py


araffin commented on July 17, 2024

@jrjbertram thanks for your comment, but I think this issue is more about the new stable-baselines tensorboard logging (used when tensorboard_log is passed), not the legacy one.


RGring commented on July 17, 2024

I would like to save the state of the model, completely stop the training procedure, and continue at a later point (with a continuous TensorBoard curve). Is that possible at the moment? I guess the timestep needs to be saved and reloaded for num_timesteps.


araffin commented on July 17, 2024

To answer your question: yes, you can already do that, but it won't be perfect when training again after loading.

See issue #301 and documentation: https://stable-baselines.readthedocs.io/en/master/guide/tensorboard.html


Gaoyuan-Liu commented on July 17, 2024

Hey @araffin,
If I understand right, this issue has already been solved and added to the main branch. So I followed the instructions in Tensorboard Integration, but whatever I put in the tensorboard_log argument, it creates a new folder and starts a new log file for TensorBoard.
My code:
model = PPO.load("ppo_panda", env=env, tensorboard_log="./tensorboard/PPO_22")
model.set_env(env)
model.learn(total_timesteps=5000)
model.save("ppo_panda")
In my understanding, it should continue and extend the previous tensorboard file, right? Did I miss any steps?

Thanks!


Miffyli commented on July 17, 2024

@Gaoyuan-Liu I do not think there is a solution merged into SB2, but there is for SB3. I recommend you try migrating over to SB3, as it is more actively supported and comes with additional fixes.


Gaoyuan-Liu commented on July 17, 2024

@Miffyli Indeed, I found the function there, thanks!


araffin commented on July 17, 2024

In my understanding, it should continue and extend the previous tensorboard file, right? Did I miss any steps?

If you look at the SB2/SB3 docs, you are missing reset_num_timesteps=False.
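To see why a fresh run folder appears on every .learn() call, here is a toy illustration of the run-naming behavior. This is not SB3 internals: `ToyLogger`, `start_run`, and `continue_previous` are made-up names, where `continue_previous=True` plays the role of `reset_num_timesteps=False`.

```python
# Toy illustration (not SB3 code) of why each .learn() call creates a
# new TensorBoard run folder (PPO_1, PPO_2, ...) unless the counter
# reset is suppressed: a reset is treated as the start of a new run.
import os
import tempfile

class ToyLogger:
    def __init__(self, log_dir, name="PPO"):
        self.log_dir, self.name, self.run_id = log_dir, name, 0

    def start_run(self, continue_previous):
        if not continue_previous:
            self.run_id += 1  # new run -> new folder -> segmented plots
        run_dir = os.path.join(self.log_dir, f"{self.name}_{self.run_id}")
        os.makedirs(run_dir, exist_ok=True)
        return run_dir

with tempfile.TemporaryDirectory() as tb:
    logger = ToyLogger(tb)
    first = logger.start_run(continue_previous=False)   # .../PPO_1
    again = logger.start_run(continue_previous=False)   # .../PPO_2 (new folder)
    resumed = logger.start_run(continue_previous=True)  # stays in PPO_2
```

Suppressing the reset keeps logging in the same run, so the curve continues instead of starting a new segment.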


Gaoyuan-Liu commented on July 17, 2024

@araffin True. I also found that each time I run model.learn, it creates a new folder for TensorBoard containing new logging data, so the TensorBoard plot is segmented.
But if I manually put the event files into one folder and run TensorBoard, it plots one continuous line from the data in the multiple files, which looks more like the training never stopped.
Thanks!
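The manual workaround described above can be scripted. This is a sketch under the assumption that each run folder holds TensorBoard event files named `events.out.tfevents.*`; `merge_event_files` is a hypothetical helper, not part of any library.

```python
# Sketch of the manual workaround: gather the events.out.tfevents.*
# files from several run folders into one directory so TensorBoard
# draws a single continuous curve. (Function name is illustrative.)
import glob
import os
import shutil

def merge_event_files(run_dirs, merged_dir):
    os.makedirs(merged_dir, exist_ok=True)
    for run_dir in run_dirs:
        pattern = os.path.join(run_dir, "events.out.tfevents.*")
        for event_file in glob.glob(pattern):
            shutil.copy(event_file, merged_dir)  # keep originals intact
```

Pointing TensorBoard at `merged_dir` then shows the runs as one continuous series.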

