Coder Social home page Coder Social logo

Comments (11)

dfm avatar dfm commented on July 18, 2024 3

I have seen this before, and I don't totally understand why this happens. In the short term, I'd recommend using the groups argument to to_netcdf (care of @avivajpeyi):

trace.to_netcdf(
    filename=...,
    groups=["posterior", "log_likelihood", "sample_stats"],
)

Here's a simpler snippet that fails with the same issue:

import pymc3 as pm
import exoplanet as xo

with pm.Model() as model:
    ecc = pm.Uniform("ecc")
    xo.eccentricity.kipping13("ecc_prior", fixed=True, observed=ecc)
    trace = pm.sample(return_inferencedata=True)
    trace.to_netcdf("test")

This could be used to debug and find a longer term solution.

from exoplanet.

dgegen avatar dgegen commented on July 18, 2024 2

Somewhat off-topic:
Well, thank you very much for your great work. For tackling scientific problems, I find it always very pleasant to use higher-level abstraction languages like Python while still having the advantage of high performance of lower-level languages like C++. Accordingly, I am grateful that packages like celerite2 exist. I thought a few times that in the long run it might be worthwhile to consider translating exoplanet to Julia, thus, potentially avoiding the dependence on various other languages that is currently required which might increase performance. Needless to say that such an undertaking is of course also very labour-intensive and would take quite some time to realise.

On-topic:
I now asked in the PyMC forum here how this issue is usually handled and will let you know in case anyone comes up with a neat solution.

from exoplanet.

dfm avatar dfm commented on July 18, 2024 2

Thanks!

If you're interested in a Julia version, there has been some work on implementing at least some parts over in: https://github.com/JuliaAstro/Transits.jl
And we have this old implementation of celerite in Julia (I'm not sure how compatible it'll be with recent versions of Julia): https://github.com/ericagol/celerite.jl

from exoplanet.

ericagol avatar ericagol commented on July 18, 2024 2

If you're interested in a Julia version, there has been some work on implementing at least some parts over in: https://github.com/JuliaAstro/Transits.jl

Also https://github.com/rodluger/Limbdark.jl

from exoplanet.

dgegen avatar dgegen commented on July 18, 2024

@dfm Thank you very much for the quick reply, the boiled down version and the workaround. I would still be curious to know if trace.observed_data can also be saved / how the custom exoplanet distributions would need to be changed.

from exoplanet.

dfm avatar dfm commented on July 18, 2024

The issue is somehow related to the fact that observed_data is required to be fixed (i.e. the same for every sample), but here we're overloading the obs argument to be "tensor" that depends on the parameters. We could hack it to use pm.Potential or something, but then it would no longer work as a prior:

ecc = xo.eccentricity.kipping13("ecc_prior", fixed=True)

Perhaps there's some other PyMC3 trickery that we could use, but I don't know what it would look like!

from exoplanet.

dgegen avatar dgegen commented on July 18, 2024

I see. I played around some more and found that the problem of xarray serialization is not limited to the custom distributions of exoplanet but also occurs for celerite2.theano.GaussianProcess objects under certain circumstance:

import numpy as np
import exoplanet as xo
import pymc3 as pm
import aesara_theano_fallback.tensor as tt
from celerite2.theano import terms, GaussianProcess

x = np.linspace(-1, 1, 1000)
texp = x[1]-x[0]
true_orbit = xo.orbits.KeplerianOrbit(period=4, t0=0)
y = xo.LimbDarkLightCurve([0, 0]).get_light_curve(orbit=true_orbit, t=x, r= 0.1, texp=texp).eval()
y = y.reshape(-1) + np.random.normal(loc=0.0, scale=0.001, size=len(x))

with pm.Model() as model:
    t0 = pm.Normal("t0", mu=0.1, sd=0.5)
    orbit = xo.orbits.KeplerianOrbit(period=3.99, t0=t0)

    light_curves = xo.LimbDarkLightCurve([0, 0]).get_light_curve(orbit=orbit, r=0.1, t=x, texp=texp)
    resid = y - tt.sum(light_curves, axis=-1) 

    kernel = terms.SHOTerm(sigma=np.std(y), rho=np.std(y), Q=1/np.sqrt(2))
    gp = GaussianProcess(kernel, t=x, yerr=np.std(y))
    gp.marginal("gp", observed=resid)  # no error if resid is replaced by y

    # Sample
    trace = pm.sample(return_inferencedata=True, tune=100, draws=200)
    trace.to_netcdf("test")

raises the error

ValueError: unable to infer dtype on variable 'gp'; xarray cannot serialize arbitrary Python objects

However, there is no error if gp.marginal("gp", observed=resid) is replaced with gp.marginal("gp", observed=y) , i.e. we marginalise on a quantity that is independent of any random variables.

from exoplanet.

dfm avatar dfm commented on July 18, 2024

Yes. This is exactly the same issue. The "observed data" isn't a constant if it depends on the model!

from exoplanet.

dgegen avatar dgegen commented on July 18, 2024

You are right of course, I just wanted to point out that the problem is not limited to the custom distribution of exoplanet.

from exoplanet.

dfm avatar dfm commented on July 18, 2024

Haha good point. The celerite2 design (mistakes?) are the same and also my own :D

from exoplanet.

dgegen avatar dgegen commented on July 18, 2024

Thanks for pointing out these projects, I will have a look at them!

from exoplanet.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.