
Comments (4)

stefanradev93 commented on September 3, 2024

Hi Yannik,

Assuming the posterior is nice (i.e., identifiable, with no degeneracies or dependencies), I would agree with you that summary_dim should be at least the number of free model parameters. However, you may want to use more summary dimensions to give the networks more degrees of freedom during optimization.

We also have some hints that overcomplete summary spaces may lead to more fragile inference in the presence of outliers (i.e., model misspecification): https://arxiv.org/abs/2112.08866. If you train with the additional MMD criterion, you can also apply PCA to the summary network outputs after optimization to check whether you have used more summary statistics than necessary (Fig. 10) or how the summary statistics relate to the parameters (Fig. 9).
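
For illustration, here is a minimal sketch of such a PCA check with scikit-learn; `summary_net` and `sim_data` are hypothetical placeholders for a trained summary network and a batch of simulated data sets:

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical names: `summary_net` is a trained summary network,
# `sim_data` is a batch of simulated data of shape (n_sim, n_obs, n_dims).
summaries = np.asarray(summary_net(sim_data))  # shape: (n_sim, summary_dim)

# Fit a PCA on the learned summary statistics and inspect the spectrum.
pca = PCA()
pca.fit(summaries)

# If the trailing components explain almost no variance, the summary
# space is likely overcomplete and summary_dim could be reduced.
print(pca.explained_variance_ratio_.round(3))
```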

Regarding lstm_units, I would keep it relatively large, since it contains the representation of the whole time series, as you have indicated.
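
To make the two knobs concrete, here is a minimal sketch of an LSTM summary network in plain Keras (not the exact BayesFlow API; all sizes are assumed values):

```python
import tensorflow as tf

lstm_units = 256   # assumed value: sizes the hidden state carrying the series
summary_dim = 16   # assumed value: should be >= number of free model parameters

summary_net = tf.keras.Sequential([
    # Compress the whole time series into a single hidden vector.
    tf.keras.layers.LSTM(lstm_units),
    # Project that vector down to the learned summary statistics.
    tf.keras.layers.Dense(summary_dim),
])

# Usage: x has shape (batch_size, n_time_steps, n_dims)
# summaries = summary_net(x)  # -> shape (batch_size, summary_dim)
```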

Let me know what works for you or if you need more details.

Cheers,
Stefan


stefanradev93 commented on September 3, 2024

I guess the optimal choice for lstm_units will depend on the complexity of the time series and the information that needs to be extracted for posterior inference (e.g., simple moments vs. multi-scale oscillatory parameters). For the time series models I have worked with, values between 128 and 512 have tended to produce rather robust results, with larger values not hurting performance, but I am not certain how well my experience generalizes to unknown models. It is definitely a parameter to juice up if initial results do not look splendid!

Note: For really long time series, attention-based architectures may be a better choice than recurrent nets altogether.
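
As a rough sketch of what that could look like (plain Keras self-attention, not a specific BayesFlow class; all sizes are assumptions):

```python
import tensorflow as tf

def build_attention_summary_net(summary_dim=16, key_dim=32, num_heads=4):
    # Variable-length univariate time series: (batch, time, 1)
    inputs = tf.keras.Input(shape=(None, 1))
    x = tf.keras.layers.Dense(key_dim)(inputs)          # embed observations
    attn = tf.keras.layers.MultiHeadAttention(
        num_heads=num_heads, key_dim=key_dim)(x, x)     # self-attention over time
    x = tf.keras.layers.LayerNormalization()(x + attn)  # residual + norm
    x = tf.keras.layers.GlobalAveragePooling1D()(x)     # pool over time steps
    outputs = tf.keras.layers.Dense(summary_dim)(x)     # learned summaries
    return tf.keras.Model(inputs, outputs)

summary_net = build_attention_summary_net()
```

Unlike an LSTM, the attention layer does not have to squeeze the entire series through a single fixed-size hidden state, which is what tends to help on very long sequences.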


yannikschaelte commented on September 3, 2024

Hi Stefan,

thanks for the information! The discussion on summary_dim makes perfect sense. We may want to incorporate the PCA checks.

Do you have any more specific tips for lstm_units? Have you explored values other than 128? I would guess it should be at least on the order of the (expected/maximum) number of data points, but maybe it is just something to experiment with a bit.

Thanks,
Yannik


yannikschaelte commented on September 3, 2024

Thanks, the suggested answers worked for us. We will try transformers soon for sure. Closing here!

