
Comments (6)

florianhartig commented on August 20, 2024

It would be easy to fix this by simply multiplying the start / end / thin values in the return of getSample by mcmcOutput$settings$thin.
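
A minimal sketch of that rescaling, written in Python purely for illustration (this is not BayesianTools code, and `to_real_iterations` is a hypothetical name). It assumes that stored row k corresponds to sampler iteration k * sampler_thin, which the thread does not state explicitly:

```python
def to_real_iterations(start, end, thin, sampler_thin):
    """Rescale row-based start/end/thin values to real sampler iterations.

    Assumption: stored row k (1-based) holds iteration k * sampler_thin.
    """
    return start * sampler_thin, end * sampler_thin, thin * sampler_thin


# A chain thinned by 10 during sampling with 10,000 stored rows
# would then be reported in units of the original iterations.
real_start, real_end, real_thin = to_real_iterations(1, 10000, 1, sampler_thin=10)
```

Under that assumption, the plot axis would run up to the real number of iterations rather than the number of stored rows.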

However, the question is what follows from this. If we do it, the "real" iteration numbers will be displayed in the plot. For example, say a sampler has been thinned by 10 during sampling, so that we have 10,000 samples, but the plot displays values up to 100,000. If the user looks at this and requests a start value of 10,000, then we should also divide the inputs (start, end, thin) in getSample by mcmcOutput$settings$thin, and so on. That could get rather confusing, because according to this logic a request of thin = 10 in getSample would have no effect, as the sample is already thinned at 10 ... or is this actually only confusing if one is not used to it?

In general, I was wondering if we should move towards using row names or an extra column for recording the iteration of the MCMC sample. The whole business with having several opportunities for thinning is getting rather confusing.

@stefan-paul, what do you think?

from bayesiantools.

stefan-paul commented on August 20, 2024

Hmm, tricky question.
I would prefer having the real iteration number displayed
in the plot, even for a thinned sample. I think the confusing part is
mostly in the functions, so the user doesn't see this. I guess we only have to take care
that we're consistent here. And maybe we should add a line or two to the help page.

For the general question, I'm still a bit undecided what the best way would be to handle
the thinning options ... I'll think about it.


florianhartig commented on August 20, 2024

Max, Tankred, opinions?


MaximilianPi commented on August 20, 2024

In the long run, we should move away from coda and just provide a toCoda function. Regarding the thinning, displaying the real iterations should avoid confusion and might be more intuitive.

A solution would be to give the coda::as.mcmc.list function the start, end, and thin values (at the end of the samplers). This way the real iterations should always be displayed, and we could additionally thin out with the plot function through getSample. The start/end arguments passed to getSample would then have to be divided by settings$thin, as you mentioned above @florianhartig.
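
To illustrate the division step, here is a rough Python sketch (not BayesianTools code; `window_to_rows` is a hypothetical helper). It assumes stored row k holds real iteration k * sampler_thin, and translates a start/end window given in real iterations into a slice over the stored rows:

```python
def window_to_rows(start, end, sampler_thin, n_rows):
    """Map a [start, end] window in real iterations to 0-based slice bounds.

    Assumption: stored row k (1-based) holds iteration k * sampler_thin.
    Returns (first, stop) so that rows[first:stop] covers the window.
    """
    # First stored row at or after `start` (ceiling division, 1-based).
    first_row = max(1, -(-start // sampler_thin))
    # Last stored row at or before `end`, capped at the chain length.
    last_row = min(n_rows, end // sampler_thin)
    if first_row > last_row:
        raise ValueError("no stored samples inside the requested window")
    return first_row - 1, last_row


# Request iterations 10,000 to 100,000 from a chain thinned by 10.
first, stop = window_to_rows(10000, 100000, sampler_thin=10, n_rows=10000)
```

A start value that does not fall on a stored iteration is moved forward to the next stored one here; whether to do that silently or with a message is a design choice the thread leaves open.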


TankredO commented on August 20, 2024

I think getting the real iteration number displayed makes more sense.

Maybe we should introduce a "bayesianSample" object or something like this as output of getSample which has all this information?


florianhartig commented on August 20, 2024

I don't think having our own object, or moving away from coda, is really an option, as a lot of coda objects are used internally. We had #5 a while ago, which was about actually moving away from our own internal structure and doing everything as coda from the start.

OK, the way I see it:

  1. We should have the correct iteration numbers displayed, so this should be fixed. For this issue, the only question is how to do this technically, i.e. either we just have to take care internally to always record start and thin, and display accordingly, or we should actually introduce a column for the iteration number in the sample (more storage space, possibly more versatile because we could also have non-even thinning). Regarding the latter - it should be noted that coda can't handle non-evenly spaced thinning anyway, so if we move to coda, we have to discard this info.

--> I think my preference would be to simply introduce start, end and thin arguments for the internal MCMC chain, to keep this compliant with coda

  2. The trickier question is how getSample should now handle a chain that is already thinned. I guess the most stringent way would be to take over the start, end, etc. arguments from the previous chain. So if we have a sampler that starts at 1000 and is thinned by 10, getSample with
  • start = 1000, thin = 10 will do nothing
  • start = 0 will throw an error / warning
  • thin = 5 will throw an error / warning
  • thin = 20 will thin out the existing chain by 2
  • thin = 13 will round to 10, with a message that rounding had to be performed
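
The cases in this list can be sketched as follows, in Python and only as an illustration (not BayesianTools code; `resolve_request` is a hypothetical name). The sketch raises an error where the list says "error / warning", and rounds to the nearest multiple of the existing thinning interval with ties rounded up; both are assumptions, since the thread does not settle them:

```python
import warnings


def resolve_request(start, thin, chain_start, chain_thin):
    """Translate start/thin given in real iterations into (row_offset, row_step).

    Assumption: the chain already starts at `chain_start` and keeps every
    `chain_thin`-th iteration, so requests can only be coarser than that.
    """
    if start < chain_start:
        raise ValueError(f"start={start} lies before the chain start {chain_start}")
    if thin < chain_thin:
        raise ValueError(f"thin={thin} is finer than the existing thin {chain_thin}")

    # Round the request to the nearest multiple of chain_thin (ties round up).
    row_step = (thin + chain_thin // 2) // chain_thin
    if row_step * chain_thin != thin:
        warnings.warn(f"thin={thin} rounded to {row_step * chain_thin}")

    # Number of stored rows to skip so the first kept row is at or after start.
    row_offset = -(-(start - chain_start) // chain_thin)
    return row_offset, row_step


# start = 1000, thin = 20 on a chain with start 1000 and thin 10:
# keep every second stored row, skipping none at the beginning.
offset, step = resolve_request(1000, 20, chain_start=1000, chain_thin=10)
```

The returned pair would be applied to the stored rows as `rows[offset::step]`; how this interacts with numSamples is not covered by this sketch.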

We will have to take care that this all works smoothly also with the numSamples argument, which is frequently used in the plots.

The alternative would be to make getSample work relative to the current chain, i.e. thin = 2 thins the existing chain by 2, regardless of the original thinning interval.
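
For comparison, the relative variant is much shorter. Again a Python sketch for illustration only (not BayesianTools code; `thin_relative` is a hypothetical name), assuming the stored samples are available as a plain sequence of rows:

```python
def thin_relative(rows, thin, chain_thin):
    """Thin the stored rows relative to the current chain.

    Returns the kept rows and the effective thinning interval in real
    iterations, which is the product of both thinning steps.
    """
    if thin < 1:
        raise ValueError("thin must be a positive integer")
    return rows[::thin], thin * chain_thin


# thin = 2 on a chain that was already thinned by 10 during sampling.
kept, effective_thin = thin_relative(list(range(10)), thin=2, chain_thin=10)
```

Here thin = 2 always halves the stored sample, and the effective interval of 20 would have to be tracked separately if the real iteration numbers are to be displayed.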

I'm not sure; both options are possibly confusing to a user. How does coda handle all this? They also allow thinning, right?

