
Comments (4)

stefanradev93 commented on September 3, 2024

Hi Yannik,

Assuming the posterior is nice (i.e., identifiable, with no degeneracies or dependencies), I would agree with you that summary_dim should be at least the number of free model parameters. However, you may want to use more summary dimensions to give the networks more degrees of freedom during optimization.

We also have some hints that overcomplete summary spaces may lead to more fragile inference in the presence of outliers (i.e., model misspecification): https://arxiv.org/abs/2112.08866. If you train with the additional MMD criterion, you can also apply PCA to the summary network outputs after optimization to check whether you have used more summary statistics than necessary (Fig. 10) or how the summary statistics relate to the parameters (Fig. 9).
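
For illustration, here is a minimal sketch of such a PCA check with scikit-learn; `summary_net` and `sim_data` are hypothetical placeholders for a trained summary network and a batch of simulated data sets:

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical names: `summary_net` is a trained summary network,
# `sim_data` is a batch of simulated data of shape (n_sim, n_obs, n_dims).
summaries = np.asarray(summary_net(sim_data))  # shape: (n_sim, summary_dim)

# Fit a PCA on the learned summary statistics and inspect the spectrum.
pca = PCA()
pca.fit(summaries)

# If the trailing components explain almost no variance, the summary
# space is likely overcomplete and summary_dim could be reduced.
print(pca.explained_variance_ratio_.round(3))
```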

Regarding lstm_units, I would keep it relatively large, since it contains the representation of the whole time series, as you have indicated.
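
To make the two knobs concrete, here is a minimal sketch of an LSTM summary network in plain Keras (not the exact BayesFlow API; all sizes are assumed values):

```python
import tensorflow as tf

lstm_units = 256   # assumed value: sizes the hidden state carrying the series
summary_dim = 16   # assumed value: should be >= number of free model parameters

summary_net = tf.keras.Sequential([
    # Compress the whole time series into a single hidden vector.
    tf.keras.layers.LSTM(lstm_units),
    # Project that vector down to the learned summary statistics.
    tf.keras.layers.Dense(summary_dim),
])

# Usage: x has shape (batch_size, n_time_steps, n_dims)
# summaries = summary_net(x)  # -> shape (batch_size, summary_dim)
```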

Let me know what works for you or if you need more details.

Cheers,
Stefan


stefanradev93 commented on September 3, 2024

I guess the optimal choice for lstm_units will depend on the complexity of the time series and the information that needs to be extracted for posterior inference (e.g., simple moments vs. multi-scale oscillatory parameters). For the time series models I have worked with, values between 128 and 512 have tended to produce rather robust results, with larger values not hurting performance, but I am not certain how well my experience generalizes to unknown models. It is definitely a parameter to juice up if initial results do not look splendid!

Note: For really long time series, attention-based architectures may be a better choice than recurrent nets altogether.
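
As a rough sketch of what that could look like (plain Keras self-attention, not a specific BayesFlow class; all sizes are assumptions):

```python
import tensorflow as tf

def build_attention_summary_net(summary_dim=16, key_dim=32, num_heads=4):
    # Variable-length univariate time series: (batch, time, 1)
    inputs = tf.keras.Input(shape=(None, 1))
    x = tf.keras.layers.Dense(key_dim)(inputs)          # embed observations
    attn = tf.keras.layers.MultiHeadAttention(
        num_heads=num_heads, key_dim=key_dim)(x, x)     # self-attention over time
    x = tf.keras.layers.LayerNormalization()(x + attn)  # residual + norm
    x = tf.keras.layers.GlobalAveragePooling1D()(x)     # pool over time steps
    outputs = tf.keras.layers.Dense(summary_dim)(x)     # learned summaries
    return tf.keras.Model(inputs, outputs)

summary_net = build_attention_summary_net()
```

Unlike an LSTM, the attention layer does not have to squeeze the entire series through a single fixed-size hidden state, which is what tends to help on very long sequences.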


yannikschaelte commented on September 3, 2024

Hi Stefan,

thanks for the information! The discussion on summary_dim makes perfect sense. We may want to incorporate the PCA checks.

Do you have any more specific tips for lstm_units? Have you explored values other than 128? I would guess it should be at least on the order of the (expected/maximum) number of data points, but maybe it is just something to experiment with a bit.

Thanks,
Yannik


yannikschaelte commented on September 3, 2024

Thanks, the suggested answers worked for us. We will try transformers soon for sure. Closing here!

