
Comments (15)

hill-a commented on July 17, 2024

Hi, thank you for the issue and sorry for the delayed answer.

The issue here is that the .learn function keeps track of the step value internally. For this use case, though, it might be good to add a keyword argument to explicitly reset the step value for TensorBoard when calling .learn.

Will make a fix, and update the documentation in the next few days.

from stable-baselines.

araffin commented on July 17, 2024

Hello,
I think what @hill-a meant was to have a keyword argument that would allow resetting the global counter, i.e., something like:
model.learn(1000, reset_step_counter=True)
And yes, you will need to define a global_step_counter variable.
Feel free to submit a PR and to ask if you need help ;) (hill-a was supposed to work on that but he seems quite busy right now)
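A minimal sketch of the counter semantics this keyword would give. This is illustrative only, not the actual stable-baselines API: `Model`, `reset_step_counter`, and the loop body are stand-ins for the real `.learn()` internals.

```python
# Hypothetical sketch (not the real stable-baselines API): .learn()
# keeps a step counter across calls, and reset_step_counter=True
# zeroes it so the next TensorBoard run starts from step 0 again.
class Model:
    def __init__(self):
        self.num_timesteps = 0  # persists across .learn() calls

    def learn(self, total_timesteps, reset_step_counter=False):
        if reset_step_counter:
            self.num_timesteps = 0
        for _ in range(total_timesteps):
            self.num_timesteps += 1  # logged as the TensorBoard x-axis

model = Model()
model.learn(1000)                           # steps 1..1000
model.learn(1000)                           # continues at 1001..2000
model.learn(1000, reset_step_counter=True)  # starts over at 1..1000
```

With the default `reset_step_counter=False`, repeated `.learn()` calls extend the same curve; passing `True` restarts the x-axis.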


pathway commented on July 17, 2024

This will be super helpful!
If anyone has a pointer or hint on how to explicitly reset the step value for TensorBoard, I might do it myself. This is a constant issue for me; it's like flying blind.


bertram1isu commented on July 17, 2024

If no one else is working on this, I might take a crack at it. I have a similar issue and was about to hack it in my local version... seems others could use the fix as well. Any objections?


bertram1isu commented on July 17, 2024

The other thing I'm after is a consistent way to do checkpointing. If I fix this by making the local variables within train part of the class instance variables, so that they retain state across training calls, I'm also going to look for any other necessary variables, so that when the save/load functions are called, I can pick up training where I left off, in the spirit of TensorFlow checkpointing.

Would you prefer these be addressed in two separate issues?

Also, whatever changes I made were aimed primarily at DDPG. The way the train functions are set up, they seem to be per-algorithm, which would mean making the same change in each algorithm. That seems like an indicator that there's probably a smarter way. Has anyone given this some thought and has a better suggestion on where to make these changes?


araffin commented on July 17, 2024

@bertram1isu yes you can work on that ;)

For the other question, it is not entirely clear to me what you want to do; please open a separate issue.


jrjbertram commented on July 17, 2024

I ended up finding a workaround to this problem that partially solves it.

The OpenAI Baselines code contains support for TensorBoard logging within its logger module, and the stable-baselines code still retains this same logging code. You can activate it via:

    from stable_baselines import logger
    print('Configuring stable-baselines logger')
    logger.configure()

To control the location where the logs are stored, set the OPENAI_LOGDIR environment variable to a location on your file system. To control the formats of the logged data (and to enable TensorBoard logging), set the OPENAI_LOG_FORMAT environment variable to "stdout,tensorboard".

This form of TensorBoard logging works fine across multiple calls to training and yields the same statistics as OpenAI Baselines. (Useful for comparing performance across the two forks.)

Here's a comparison of an algorithm running on an environment, but with different numbers of timesteps per learning call (1e5, 1e6, 1e9).

[screenshot]

And a second screenshot of a different part of the tensorboard display:
[screenshot]

These displays show consistent results across multiple calls to train the agent against the environment. (This is evident from the sawtooth-shaped curves in the episodes plot.)

More complete snippet that I'm using right now:

import os

basedir = '/some/directory'

# Create the log directory if it does not already exist
try:
    os.makedirs(basedir)
    print("Directory", basedir, "created")
except FileExistsError:
    pass

os.environ['OPENAI_LOGDIR'] = basedir
os.environ['OPENAI_LOG_FORMAT'] = 'stdout,tensorboard'

from stable_baselines import logger
print('Configuring stable-baselines logger')
logger.configure()

Full code for reference:
https://github.com/jrjbertram/jsbsim_rl/blob/d65d63fe5e3b4e8ac9be580744b0242ab86eafee/compare.py


araffin commented on July 17, 2024

@jrjbertram thanks for your comment, but I think this issue is more about the new stable-baselines tensorboard logging (used when tensorboard_log is passed), not the legacy one.


RGring commented on July 17, 2024

I would like to save the state of the model, completely stop the training procedure, and continue at a later point (with a continuous TensorBoard curve). Is that possible at the moment? I guess the timestep needs to be saved and reloaded for num_timesteps.


araffin commented on July 17, 2024

To answer your question: yes, you can already do that, but it won't be perfect when training again after loading.

See issue #301 and documentation: https://stable-baselines.readthedocs.io/en/master/guide/tensorboard.html


Gaoyuan-Liu commented on July 17, 2024

Hey @araffin,
If I understand right, this issue has already been solved and added to the main branch. So I followed the instructions in Tensorboard Integration, but whatever I put in the tensorboard_log argument, it creates a new folder and starts a new log file for TensorBoard.
My code:
model = PPO.load("ppo_panda", env=env, tensorboard_log="./tensorboard/PPO_22")
model.set_env(env)
model.learn(total_timesteps=5000)
model.save("ppo_panda")
In my understanding, it should continue and extend the previous tensorboard file, right? Did I miss any steps?

Thanks!


Miffyli commented on July 17, 2024

@Gaoyuan-Liu I do not think there is a solution merged into SB2, but there is for SB3. I recommend you try migrating over to SB3, as it is more actively supported and comes with additional fixes.


Gaoyuan-Liu commented on July 17, 2024

@Miffyli Indeed, I found the function there, thanks!


araffin commented on July 17, 2024

In my understanding, it should continue and extend the previous tensorboard file, right? Did I miss any steps?

If you look at the SB2/SB3 docs, you are missing reset_num_timesteps=False.
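To see why a fresh run folder appears on every .learn() call, here is a toy illustration of the run-naming behavior. This is not SB3 internals: `ToyLogger`, `start_run`, and `continue_previous` are made-up names, where `continue_previous=True` plays the role of `reset_num_timesteps=False`.

```python
# Toy illustration (not SB3 code) of why each .learn() call creates a
# new TensorBoard run folder (PPO_1, PPO_2, ...) unless the counter
# reset is suppressed: a reset is treated as the start of a new run.
import os
import tempfile

class ToyLogger:
    def __init__(self, log_dir, name="PPO"):
        self.log_dir, self.name, self.run_id = log_dir, name, 0

    def start_run(self, continue_previous):
        if not continue_previous:
            self.run_id += 1  # new run -> new folder -> segmented plots
        run_dir = os.path.join(self.log_dir, f"{self.name}_{self.run_id}")
        os.makedirs(run_dir, exist_ok=True)
        return run_dir

with tempfile.TemporaryDirectory() as tb:
    logger = ToyLogger(tb)
    first = logger.start_run(continue_previous=False)   # .../PPO_1
    again = logger.start_run(continue_previous=False)   # .../PPO_2 (new folder)
    resumed = logger.start_run(continue_previous=True)  # stays in PPO_2
```

Suppressing the reset keeps logging in the same run, so the curve continues instead of starting a new segment.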


Gaoyuan-Liu commented on July 17, 2024

@araffin True. I also found that each time I run model.learn, it creates a new folder for TensorBoard containing new logging data, so the TensorBoard plot is segmented.
But if I manually put the event files into one folder and run TensorBoard, it plots one continuous line from the data in the multiple files, which looks more like the training never stopped.
Thanks!
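The manual workaround described above can be scripted. This is a sketch under the assumption that each run folder holds TensorBoard event files named `events.out.tfevents.*`; `merge_event_files` is a hypothetical helper, not part of any library.

```python
# Sketch of the manual workaround: gather the events.out.tfevents.*
# files from several run folders into one directory so TensorBoard
# draws a single continuous curve. (Function name is illustrative.)
import glob
import os
import shutil

def merge_event_files(run_dirs, merged_dir):
    os.makedirs(merged_dir, exist_ok=True)
    for run_dir in run_dirs:
        pattern = os.path.join(run_dir, "events.out.tfevents.*")
        for event_file in glob.glob(pattern):
            shutil.copy(event_file, merged_dir)  # keep originals intact
```

Pointing TensorBoard at `merged_dir` then shows the runs as one continuous series.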

