Comments (8)

juliangehring avatar juliangehring commented on May 30, 2024

I assume that you would want to keep the original pairing between x and y in your use case? I.e. if you pick x[i] you also want to pick y[i] in a sampling step?

from bootstrap.jl.

juliangehring avatar juliangehring commented on May 30, 2024

If the data is an Array, the implementation will treat the rows as observations and the columns as variables; analogous to a DataFrame. The bootstrap step will sample different observations while keeping the relation between the variables intact. In other words, it will generate a new data set with data[idx,:] for each sampling.
You could do the same with a DataFrame, where the roles of observations and variables are a bit clearer than in a plain array. But both will behave the same.
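
To make the `data[idx,:]` idea concrete, here is a minimal hand-rolled sketch of a single resampling step using only the standard library. It is not the package's internal code; the data values, the seed, and the variable names are made up for illustration.

```julia
using Random

# Hypothetical data: 5 observations (rows) of 2 variables (columns),
# constructed so that y is always 10 * x.
data = [1.0 10.0;
        2.0 20.0;
        3.0 30.0;
        4.0 40.0;
        5.0 50.0]

rng = MersenneTwister(1)   # arbitrary fixed seed, only for reproducibility
n = size(data, 1)

# Draw row indices with replacement, then take whole rows,
# so x[i] and y[i] are always picked together.
idx = rand(rng, 1:n, n)
resampled = data[idx, :]
```

Because whole rows are selected, every row of `resampled` is still one of the original (x, y) pairs, even though some rows may appear several times and others not at all.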

juliangehring avatar juliangehring commented on May 30, 2024

In your case, the package should work as you seem to expect. Let me know if my explanations make sense or if anything isn't clear.

drbenvincent avatar drbenvincent commented on May 30, 2024

Thanks for the quick reply.

> I assume that you would want to keep the original pairing between x and y in your use case? I.e. if you pick x[i] you also want to pick y[i] in a sampling step?

Yes

> If the data is an Array, the implementation will treat the rows as observations and the columns as variables; analogous to a DataFrame. The bootstrap step will sample different observations while keeping the relation between the variables intact. In other words, it will generate a new data set with data[idx,:] for each sampling.

Sounds like this is doing what I wanted it to do.

> In your case, the package should work as you seem to expect. Let me know if my explanations make sense or if anything isn't clear.

I guess I am a bit confused about why there were 4 output variables rather than just 1 for the bootstrap correlation values. What I was after overall was the 95% CIs on the correlation coefficient. So I am expecting some range overlapping zero (the variables are independent/uncorrelated in this example), but was unsure how to interpret the output.

juliangehring avatar juliangehring commented on May 30, 2024

> I guess I am a bit confused about why there were 4 output variables rather than just 1 for the bootstrap correlation values. What I was after overall was the 95% CIs on the correlation coefficient. So I am expecting some range overlapping zero (the variables are independent/uncorrelated in this example), but was unsure how to interpret the output.

The Statistics.cor function computes a 2x2 correlation matrix for your input array:

2×2 Array{Float64,2}:
 1.0       0.192743
 0.192743  1.0

Please note that this is a full correlation matrix and not only a coefficient between both variables.

This is why the output of the bootstrap step also gives you 4 variables: it is simply the output of cor applied to the bootstrapped data set. You are probably only interested in the off-diagonal elements, and could use e.g. x -> cor(x)[2] or x -> cor(x[:,1], x[:,2]) as your statistic function.
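
As a quick check of the point above, the following sketch compares the matrix and scalar forms of `cor` on a made-up 100×2 array. The data and the seed are arbitrary, so the actual coefficient will differ from the one shown earlier.

```julia
using Statistics, Random

rng = MersenneTwister(42)    # arbitrary seed
x = randn(rng, 100, 2)       # hypothetical data: 100 observations, 2 variables

C = cor(x)                   # full 2×2 correlation matrix (columns are variables)
r = cor(x[:, 1], x[:, 2])    # single coefficient between the two columns

# Julia arrays are column-major, so linear index 2 is C[2, 1],
# one of the two (identical) off-diagonal entries.
off_diagonal = C[2]
```

Here `off_diagonal` and `r` agree up to floating-point rounding, so either form of the statistic function should give the same bootstrap result.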

The confidence intervals return

(estimate, lower_bound, upper_bound)

for each of the 4 bootstrapped variables. The vector contains the confidence intervals for all 4 variables. In line with what I mentioned above, you would want the 2nd or 3rd element of the vector in this case.
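
For intuition about what such an (estimate, lower_bound, upper_bound) triple represents, here is a hand-rolled percentile interval for the scalar correlation, written with the standard library only. This is a sketch under my own assumptions (percentile method, 1000 resamples, made-up data and seed), not the package's API or necessarily the interval type it computes.

```julia
using Statistics, Random

rng = MersenneTwister(123)              # arbitrary seed
data = randn(rng, 100, 2)               # hypothetical data: two independent columns

stat(d) = cor(d[:, 1], d[:, 2])         # scalar statistic instead of the 2×2 matrix

n = size(data, 1)
nboot = 1000                            # number of resamples, chosen arbitrarily

# Resample whole rows with replacement and recompute the statistic each time.
boot = [stat(data[rand(rng, 1:n, n), :]) for _ in 1:nboot]

estimate = stat(data)
lower, upper = quantile(boot, [0.025, 0.975])   # 95% percentile interval
ci = (estimate, lower, upper)
```

With a scalar statistic like `stat`, there is only one triple to read off, rather than one per entry of the correlation matrix.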

drbenvincent avatar drbenvincent commented on May 30, 2024

Perfect. I got confused about the output of Statistics.cor because when I was testing I used it in the form of cor(personality, looks) and just got the scalar output.

So this makes sense, but I think x -> cor(x[1,:], x[2,:]) should be x -> cor(x[:,1], x[:,2]). That gives me:

((0.005388735203249658, -0.22214553137849985, 0.21166247504417582),)

and those numbers make sense for 100 uncorrelated values. Not sure why the extra empty element though. But that solves my question. Thanks very much!

juliangehring avatar juliangehring commented on May 30, 2024

Yes, the indices should be the other way around. I have updated the comment above to fix this.

The "empty element" is just Julia's way of displaying a tuple with only one element.
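
A small illustration of that display rule, using made-up rounded numbers rather than the values from the thread:

```julia
single = (0.5,)                  # trailing comma: a tuple with one element
plain  = (0.5)                   # parentheses only: just a Float64
nested = ((0.0, -0.2, 0.2),)     # one-element tuple holding a 3-tuple

shown = repr(single)             # the trailing comma appears in the printed form
inner = first(nested)            # unwrap to get the (estimate, lower, upper) triple
```

The trailing comma is what distinguishes a one-element tuple from a parenthesized value, which is why it shows up in the printed result.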

drbenvincent avatar drbenvincent commented on May 30, 2024

Maybe this issue will be useful documentation if anyone else has a similar issue. Thanks for the help.
