Comments (3)
Hi Yishiheng,
Interesting question! Upfront notice: since a few years have passed, take everything with a grain of salt...
You are right that with `timesteps=1` and otherwise the default configuration, the network has no knowledge of previous data, and the whole point of an RNN/LSTM would be moot.
However, since the `stateful=True` option is always set in the models here, the state of the LSTM cell is propagated between batches. As I understood it, this should in theory allow the network to learn (strong) patterns very far apart without having to specify the timesteps explicitly. Reference: http://philipperemy.github.io/keras-stateful-lstm/
Please correct me if I am wrong; it would also be very interesting to see whether a different specification with `timesteps>1` would improve the predictions!
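To make that mechanism concrete, here is a minimal NumPy sketch (not the repository's actual code; `lstm_step` is a simplified stand-in for a Keras LSTM cell): when the state is carried across single-timestep batches, earlier inputs still influence the output, whereas resetting before every batch reduces the model to seeing only the current input.

```python
import numpy as np

rng = np.random.default_rng(0)

def lstm_step(x, h, c, W, U, b):
    """One LSTM step; gate weights stacked as [input, forget, cell, output]."""
    z = x @ W + h @ U + b
    n = h.shape[-1]
    sig = lambda a: 1.0 / (1.0 + np.exp(-a))
    i, f = sig(z[:, :n]), sig(z[:, n:2 * n])
    g, o = np.tanh(z[:, 2 * n:3 * n]), sig(z[:, 3 * n:])
    c_new = f * c + i * g
    return o * np.tanh(c_new), c_new

n_in, n_hid, T = 3, 4, 10
W = rng.normal(size=(n_in, 4 * n_hid)) * 0.5
U = rng.normal(size=(n_hid, 4 * n_hid)) * 0.5
b = np.zeros(4 * n_hid)
xs = rng.normal(size=(T, 1, n_in))  # T consecutive single-timestep "batches"

# "Stateful": carry (h, c) across the single-timestep batches.
h, c = np.zeros((1, n_hid)), np.zeros((1, n_hid))
for x in xs:
    h, c = lstm_step(x, h, c, W, U, b)
h_stateful = h

# "Stateless": reset (h, c) to zero before every batch.
for x in xs:
    h_reset, _ = lstm_step(x, np.zeros((1, n_hid)), np.zeros((1, n_hid)), W, U, b)

# With resets, the final output depends only on the last input...
h_last, _ = lstm_step(xs[-1], np.zeros((1, n_hid)), np.zeros((1, n_hid)), W, U, b)
print(np.allclose(h_reset, h_last))      # True
# ...while the carried state lets earlier inputs influence the output.
print(np.allclose(h_stateful, h_reset))  # False
```

This is exactly what `stateful=True` with `batch_size=1` relies on: the state, not the input shape, provides the memory.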
from lstm-load-forecasting.
@dafrie Emmmm,
I think that with `stateful=True`, the LSTM cell state is indeed propagated from one batch to the next. But usually, when we train the model on a new batch, we reset the state of the LSTM cell. I think the reason we reset the state is that the trained parameters do not include the LSTM cell state: when we use the model to forecast the load, the initial state should be the zero state (or some other default initial state) rather than a leftover state from the training period.
Excuse me, I don't know if my expression is clear or not.
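The concern above can be illustrated with the same kind of simplified NumPy LSTM step (a stand-in, not the repository's code): the trained weights `W`, `U`, `b` are what gets saved, while the state is a runtime quantity, so the prediction for the same input differs depending on whether it starts from the zero state or from some leftover state.

```python
import numpy as np

rng = np.random.default_rng(1)

def lstm_step(x, h, c, W, U, b):
    """One LSTM step; gate weights stacked as [input, forget, cell, output]."""
    z = x @ W + h @ U + b
    n = h.shape[-1]
    sig = lambda a: 1.0 / (1.0 + np.exp(-a))
    i, f = sig(z[:, :n]), sig(z[:, n:2 * n])
    g, o = np.tanh(z[:, 2 * n:3 * n]), sig(z[:, 3 * n:])
    c_new = f * c + i * g
    return o * np.tanh(c_new), c_new

n_in, n_hid = 3, 4
# "Trained" parameters: these are what a saved model contains -- the state is not.
W = rng.normal(size=(n_in, 4 * n_hid)) * 0.5
U = rng.normal(size=(n_hid, 4 * n_hid)) * 0.5
b = np.zeros(4 * n_hid)

x_test = rng.normal(size=(1, n_in))

# Prediction from the default zero state...
zeros = np.zeros((1, n_hid))
h_zero, _ = lstm_step(x_test, zeros, zeros, W, U, b)

# ...vs. from some leftover state of the training period.
h_train = rng.normal(size=(1, n_hid))
c_train = rng.normal(size=(1, n_hid))
h_left, _ = lstm_step(x_test, h_train, c_train, W, U, b)

# Same input, same weights, different prediction:
print(np.allclose(h_zero, h_left))  # False
```

Whether the leftover state helps (it summarises the training history) or hurts (it is not part of the learned parameters) is exactly the point under discussion.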
Hm, I understand your point, but let me quickly summarise how I saw it (and coded it...):
- Generate a multivariate dataset for the whole period.
- Train the network with `stateful=True` and a batch size of 1 (i.e. one point in time). Do not call `reset_states()` between batches, since otherwise the whole LSTM approach here wouldn't make much sense. The reason for not using a different input shape (with lagged data, for example): I did not want to model this explicitly, and I understood that the LSTM cell state could achieve the long-term memory effect I wanted for modelling long-term patterns.
- Run training for multiple epochs, calling `reset_states()` before each epoch.
- For testing/predictions, call `reset_states()` before predicting, feed in the multivariate data (except the actual load), and let it either:
  - create a forecast for the whole test period at once, or
  - create a rolling forecast by retraining the model after each day (24 hours, simulating a real-world scenario), but without resetting the state each time.

Does that make sense? Since this was only a "pet" project for me and I unfortunately have not worked with LSTMs since, I am quite sure that the approach/architecture of this repository could be done much smarter... Generally, I would be really interested to see a comparison with, for example, `timesteps>1` or with `stateful=False`...
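The reset schedule above can be sketched with a hypothetical stub class (`StatefulStub` is invented for illustration; its `train_on_batch`, `predict`, and `reset_states` only mimic the Keras method names and track state, with the actual weight updates omitted):

```python
import numpy as np

class StatefulStub:
    """Stand-in for a stateful model: tracks only its state and reset calls."""
    def __init__(self):
        self.state = 0.0
        self.resets = 0
    def reset_states(self):
        self.state = 0.0
        self.resets += 1
    def train_on_batch(self, x, y):
        self.state += 1.0  # state carried across batches within an epoch
    def predict(self, x):
        self.state += 1.0
        return self.state

series = np.arange(49.0)  # toy hourly series: 24 h train, 24 h test
model = StatefulStub()

# Training: reset once per epoch, never between the single-timestep batches.
for epoch in range(3):
    model.reset_states()
    for t in range(24):
        model.train_on_batch(series[t], series[t + 1])

# Prediction: reset once before forecasting, then keep the state across the
# rolling 24-hour windows (no reset between days).
model.reset_states()
forecasts = [model.predict(series[24 + t]) for t in range(24)]

print(model.resets)  # -> 4 (3 training epochs + 1 before prediction)
```

The stub makes the key design choice visible: the only reset points are epoch boundaries and the start of the forecast, never between batches.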