
About timesteps · lstm-load-forecasting (open, 3 comments)

dafrie commented on September 28, 2024
About timesteps


Comments (3)

dafrie commented on September 28, 2024

Hi Yisiheng,
Interesting question! Upfront notice: since a few years have passed, take everything with a grain of salt...

You are right that with timesteps=1 and an otherwise default configuration, the network has no knowledge of previous data, and the whole point of an RNN/LSTM would be moot.
However, since the stateful=True option is always set in the models here, the state of the LSTM cell is propagated between batches. As I understood it, this should in theory allow the network to learn (strong) patterns that lie very far apart without having to explicitly specify the timesteps. Reference: http://philipperemy.github.io/keras-stateful-lstm/

Please correct me if I am wrong. It would also be very interesting to see whether a different specification with timesteps>1 would improve the predictions!
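
For reference, a minimal sketch of such a stateful configuration (not taken from the repo; the layer size, feature count, and tensorflow.keras import path are assumptions on my side) could look like this:

```python
# Minimal sketch of a stateful LSTM with timesteps=1 (assumptions: layer size,
# feature count, and the tensorflow.keras import path; the repo may differ).
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

n_features = 8   # hypothetical number of calendar/weather inputs
batch_size = 1

model = Sequential([
    # batch_input_shape = (batch_size, timesteps, features); with timesteps=1
    # each sample is a single point in time.
    LSTM(50, batch_input_shape=(batch_size, 1, n_features), stateful=True),
    Dense(1),
])
model.compile(optimizer='adam', loss='mse')

# With stateful=True the cell/hidden state is carried over from one batch to
# the next instead of being re-initialised, which is what lets long-range
# patterns be picked up even though timesteps=1. shuffle=False keeps the
# samples in temporal order.
# model.fit(X, y, batch_size=batch_size, epochs=1, shuffle=False)
```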


Yisiheng commented on September 28, 2024

@dafrie emmmmm,
I think that with stateful=True the state of the LSTM cell is indeed propagated from one batch to the next. But usually, when we train the model on a new batch, we reset the state of the LSTM cell. I think the reason we reset the state is that the trained parameters do not contain the state of the LSTM cell: when we use the model to forecast the load, the initial state should be the zero state (or some other default initial state) rather than a state left over from the training period.
Excuse me, I am not sure whether my expression is clear.
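
If I read this correctly, the convention described here would look roughly like the following sketch (hypothetical names: the stateful model from the earlier sketch, plus independent_sequences and test_X; not code from this repository):

```python
# Sketch of the convention described above: reset the LSTM state whenever a new,
# independent sequence starts, and forecast from the default (zero) state rather
# than whatever state was left over at the end of training.
for sequence_X, sequence_y in independent_sequences:
    model.reset_states()                      # start each sequence from the zero state
    model.fit(sequence_X, sequence_y, batch_size=1, epochs=1, shuffle=False)

model.reset_states()                          # forecast from the default initial state
predictions = model.predict(test_X, batch_size=1)
```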


dafrie commented on September 28, 2024

Hm, I understand your point, but let me quickly summarise how I saw it (and coded it...):

  • Generate a multivariate dataset for the whole period

  • Train the network with stateful=True and a batch size of 1 (i.e. one point in time)
    --> Do not call reset_states() between batches, since otherwise the whole LSTM approach here wouldn't make much sense. The reason I did not use a different input shape (with lagged data, for example) is that I did not want to model this explicitly; I understood that the LSTM cell state could provide the long-term memory effect I wanted for modelling long-term patterns.

  • Run training for multiple epochs, calling reset_states() before each epoch (see the sketch after this list)

  • For testing/predictions, call reset_states() before prediction, feed in the multivariate data (except the actual load), and let it either:

    • Create a forecast for the whole test period at once, or
    • Create a rolling forecast by retraining the model after each day (24 hours, simulating a real-world scenario) without resetting the state each time.
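
In code, the loop described above would be roughly the following (a sketch only, with assumed variable names train_X, train_y and test_X shaped (samples, 1, n_features); not the repository's exact implementation):

```python
# Sketch of the training/forecasting loop described in the list above
# (assumed names: train_X, train_y, test_X; not the repo's exact code).
n_epochs = 10                                   # illustrative value

for epoch in range(n_epochs):
    model.reset_states()                        # reset only at epoch boundaries
    # shuffle=False so the state flows through the data in temporal order;
    # the state is deliberately NOT reset between batches within an epoch.
    model.fit(train_X, train_y, batch_size=1, epochs=1, shuffle=False)

# Reset once before testing, then forecast the whole test period in one go
# (the rolling 24-hour variant would retrain after each day without resetting).
model.reset_states()
forecast = model.predict(test_X, batch_size=1)
```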

Does that make sense? Since this was only a pet project for me and I unfortunately have not worked with LSTMs since, I am quite sure that the approach/architecture of this repository could be done much more cleverly... Generally, I would be really interested to see a comparison with, for example, timesteps>1 and stateful=False...
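
Such a comparison would mainly be a data-reshaping exercise; a hedged sketch of the timesteps>1, stateful=False variant (window length, layer size, and the helper name are my assumptions, not code from this repository):

```python
# Hedged sketch of the alternative specification: explicit lagged windows with
# timesteps > 1 and stateful=False.
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

def make_windows(X, y, timesteps=24):
    """Stack `timesteps` consecutive rows per sample: (samples, timesteps, features)."""
    Xw = np.stack([X[i:i + timesteps] for i in range(len(X) - timesteps)])
    yw = y[timesteps:]
    return Xw, yw

Xw, yw = make_windows(train_X_2d, train_y, timesteps=24)   # hypothetical 2-D inputs

model = Sequential([
    LSTM(50, input_shape=(24, Xw.shape[-1])),              # stateful defaults to False
    Dense(1),
])
model.compile(optimizer='adam', loss='mse')
model.fit(Xw, yw, epochs=10, batch_size=32)                # shuffling between windows is fine here
```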

