
Comments (14)

sethaxen commented on May 18, 2024

Maybe we should just create a new German word or something.


sethaxen commented on May 18, 2024

For reference, here are some not-very-thorough-and-potentially-wrong mappings to other packages:

  • arguments
    • Stan: some data
    • PyMC3: some named_vars
    • ArviZ: constant_data variables
  • assigned for LHS of Assign (=) statements
    • Stan: transformed data and transformed parameters
    • PyMC3: deterministics
    • ArviZ: posterior/prior variables
  • sampled for LHS of Sample (~) statements
    • Stan: parameters and some data
    • PyMC3: free_RVs and observed_RVs
    • ArviZ: posterior/prior/-predictive/observed_data variables
  • parameters(m) = union(assigned(m), sampled(m))
    • Stan: some data, transformed data, transformed parameters, parameters
    • PyMC3: some named_vars
  • variables(m) = union(arguments(m), parameters(m))
    • Stan: data, transformed data, transformed parameters, parameters
    • PyMC3: named_vars

That's a mess. I'm fine with the variable subsets you've given. I see arguments and parameters probably being the most used.


cscherrer commented on May 18, 2024

Great idea, thank you Seth!

I think we're done with the "better named subsets" issue, so I'll close this


sethaxen commented on May 18, 2024

My first reaction is that I like all of these except maybe the last 2. The Soss perspective is still new for me.

What is the intuition behind all LHS being "parameters"? And in what sense are the arguments variable? As I understand it, they're variable in the sense that different arguments can be passed, but you're using the word "arguments" for that, not "variable". In the context of that specific model, they're fixed, not variable. My intuition is that "parameters" and "variable" make more sense swapped, because (nearly) everything on the LHS will follow a distribution, and the term "parameters" is so generic and hence could describe anything in the model.

One more category of interest might be "observations" or "fixed": something that would normally depend on other random variables but is fixed due to conditioning, i.e. observed data. But I don't know if observations in Soss are ever handled differently from any other parameter.


cscherrer commented on May 18, 2024

"variables" is always a weird term, because mathematically it's often used to mean "a value bound to a name" more than "a thing that changes". Haskell and SSA languages still talk about variables ;)

In statistical analysis, it's really common to talk about estimation or inference on parameters, or sampling from the posterior distribution over the parameters.

Observations in the usual sense aren't really a thing in Soss, or maybe more accurately some values might be observed, but we don't know ahead of time which those might be.

Yeah, naming things is hard


cscherrer commented on May 18, 2024

@sethaxen

julia> m = @model σ2 begin
       σ = sqrt(σ2)
       x ~ Normal(0,σ)
       end;

julia> truth = rand(m(σ2 = 100))
(σ2 = 100, σ = 10.0, x = 9.972715385552725)

julia> arguments(m,truth)
(σ2 = 100,)

julia> assigned(m,truth)
(σ = 10.0,)

julia> sampled(m,truth)
(x = 9.972715385552725,)

julia> parameters(m,truth)
(σ = 10.0, x = 9.972715385552725)

julia> variables(m,truth)
(σ2 = 100, σ = 10.0, x = 9.972715385552725)


sethaxen commented on May 18, 2024

Awesome! There seems to be some issue showing a model in the REPL once arguments are passed to it.

This motivates some nice convenience functions:

using Soss, NamedTupleTools

function sample_posterior_predictive(
    m::Soss.JointDistribution,
    observations::NamedTuple,
    posterior::AbstractVector{<:NamedTuple}
)
    arg_keys = arguments(m.model)
    obs_keys = keys(observations)
    # parameters that were actually inferred, i.e. everything not observed
    var_keys = (setdiff(parameters(m.model), obs_keys)...,)
    nonobs_keys = (var_keys..., arg_keys...)

    # turn the inferred parameters into arguments of a predictive model, then
    # draw from it once per posterior sample, keeping only the predicted
    # (observation-like) variables
    pred = predictive(m.model, var_keys...)
    post_pred = map(posterior) do draw
        delete(rand(pred(merge(draw, m.args))), nonobs_keys)
    end

    return particles(post_pred)
end

function sample_prior_with_predictive(
    m::Soss.JointDistribution,
    observations::NamedTuple,
    N::Int
)
    obs_keys = keys(observations)
    var_keys = (setdiff(parameters(m.model), obs_keys)...,)

    # draw N joint samples, then split into the prior over the unobserved
    # parameters and the prior predictive over the observation-like variables
    prior_with_pred = parameters(m.model, particles(m, N))
    prior = select(prior_with_pred, var_keys)
    prior_pred = delete(prior_with_pred, var_keys)

    return prior, prior_pred
end

This makes it really easy to get ArviZ groups. e.g.

julia> using Random; Random.seed!(42);

julia> model = Soss.@model (J, σ) begin
           μ ~ Normal(0, 5)
           τ ~ HalfCauchy(5)
           θ ~ Normal(μ, τ) |> iid(J)
           y ~ For(1:J) do j
               Normal(θ[j], σ[j])
           end
       end;

julia> nsamples = 1000

julia> obs = (y = [28.0, 8.0, -3.0, 7.0, -1.0, 1.0, 18.0, 12.0],);

julia> args = (J = 8, σ = [15.0, 10.0, 16.0, 11.0, 9.0, 11.0, 10.0, 18.0]);

julia> param_model = model(args);

julia> post = dynamicHMC(param_model, obs, logpdf, nsamples);

julia> prior, prior_pred = sample_prior_with_predictive(param_model, obs, length(post));

julia> post_pred = sample_posterior_predictive(param_model, obs, post);

julia> (posterior = particles(post), posterior_predictive = post_pred, prior = prior, prior_predictive = prior_pred, observed_data = obs, constant_data = args)
(posterior = (τ = 3.7 ± 3.2, μ = 4.15 ± 3.1, θ = Particles{Float64,1000}[6.08 ± 5.6, 4.62 ± 4.5, 3.67 ± 5.3, 4.52 ± 5.0, 3.39 ± 4.3, 3.91 ± 5.0, 5.93 ± 4.9, 4.82 ± 5.0]), posterior_predictive = (y = Particles{Float64,1000}[6.37 ± 16.0, 4.59 ± 11.0, 4.01 ± 17.0, 4.57 ± 12.0, 3.17 ± 10.0, 4.02 ± 12.0, 6.25 ± 12.0, 4.78 ± 19.0],), prior = (μ = 3.36e-16 ± 5.0, τ = 26.8 ± 220.0, θ = Particles{Float64,1000}[2.33 ± 170.0, -9.06 ± 240.0, -7.05 ± 220.0, -6.44 ± 170.0, -9.12 ± 280.0, -0.988 ± 100.0, 6.06 ± 160.0, -0.128 ± 190.0]), prior_predictive = (y = Particles{Float64,1000}[2.33 ± 170.0, -9.06 ± 240.0, -7.05 ± 220.0, -6.44 ± 170.0, -9.12 ± 280.0, -0.988 ± 100.0, 6.06 ± 160.0, -0.128 ± 190.0],), observed_data = (y = [28.0, 8.0, -3.0, 7.0, -1.0, 1.0, 18.0, 12.0],), constant_data = (J = 8, σ = [15.0, 10.0, 16.0, 11.0, 9.0, 11.0, 10.0, 18.0]))

Would it be useful to have convenience functions like these in Soss? Do you think this will be a common workflow, or should we make these more general?


cscherrer commented on May 18, 2024

I really like the idea of having some helper functions for this sort of thing, but it feels awkward to me to pass in the observations. After inference, we'll have values for the sampled parameters. What do you think of adding methods like

predictive(::JointDistribution, posterior::Vector{NamedTuple}) 
predictive(::JointDistribution, posterior::NamedTuple) # for particles-based approach


cscherrer commented on May 18, 2024

So something like

function predict(d::Soss.JointDistribution, post::Vector{NamedTuple{N,T}}) where {N,T}
    args = d.args
    m = d.model
    pred = predictive(m, keys(post[1])...)
    map(nt -> rand(pred(merge(args,nt))), post)
end

julia> particles(param_model) |> pairs
pairs(::NamedTuple) with 6 entries:
  :J => 8
  :σ => [15.0, 10.0, 16.0, 11.0, 9.0, 11.0, 10.0, 18.0]
  :τ => 26.8 ± 220.0
  :μ => 4.43e-16 ± 5.0
  :θ => Particles{Float64,1000}[9.4 ± 250.0, -5.75 ± 150.0, -1.61 ± 57.0, -9.3 ± 2…
  :y => Particles{Float64,1000}[9.4 ± 250.0, -5.75 ± 150.0, -1.61 ± 59.0, -9.3 ± 2…

julia> predict(param_model, post) |> particles |> pairs
pairs(::NamedTuple) with 4 entries:
  :J => 8.0 ± 0.0
  :σ => Particles{Float64,1000}[15.0, 10.0, 16.0, 11.0, 9.0, 11.0, 10.0, 18.0]
  :θ => Particles{Float64,1000}[5.98 ± 5.9, 4.33 ± 4.9, 3.18 ± 5.0, 4.4 ± 5.1, 2.7…
  :y => Particles{Float64,1000}[6.86 ± 16.0, 4.35 ± 11.0, 2.78 ± 17.0, 4.49 ± 12.0…


sethaxen commented on May 18, 2024

I like this implementation of predict. It's weird that the arguments would end up as particles, but I guess the version that takes in particles could avoid that. What would that version look like?

And since this is so general, the variable doesn't need to be post and could be draws or sample or something.


cscherrer commented on May 18, 2024

Good point, I think we could just merge back with the original arguments to avoid constant particles


cscherrer commented on May 18, 2024

Oh wait, maybe that's not right. The particles call is outside the function.


cscherrer commented on May 18, 2024

Oh I see, you mean the one that takes particles as an argument. This is in dev, but the particles version isn't working correctly yet.


sethaxen commented on May 18, 2024

Oh you know what, it would be good for these to be implemented with an AbstractRNG as the first argument, and to have these versions forward to it. So

using Soss, Random

function predict(rng::AbstractRNG, d::JointDistribution, post::Vector{NamedTuple{N,T}}) where {N,T}
    args = d.args
    m = d.model
    pred = predictive(m, keys(post[1])...)
    map(nt -> rand(rng, pred(merge(args,nt))), post)
end

function predict(rng::AbstractRNG, m::Model, post::Vector{NamedTuple{N,T}}) where {N,T}
    pred = predictive(m, keys(post[1])...)
    map(nt -> rand(rng, pred(nt)), post)
end

# no RNG given: forward to the global RNG
predict(args...; kwargs...) = predict(Random.GLOBAL_RNG, args...; kwargs...)

I'm planning to look through and see how we can better support user-provided RNGs. Looks complicated but will make testing a lot easier.
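Hypothetically, usage with an explicit RNG could then look like this (param_model and post as in the earlier example):

julia> using Random

julia> rng = MersenneTwister(42);

julia> pred_draws = predict(rng, param_model, post);  # reproducible for a fixed seed

julia> predict(param_model, post);  # no RNG passed, so it falls back to Random.GLOBAL_RNG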

