aesara-devs / aeppl
Tools for an Aesara-based PPL.
Home Page: https://aeppl.readthedocs.io
License: MIT License
We need to add support for mixtures defined using the IfElse Op.
The following is a simple example that isn't supported:
import aesara.tensor as at
from aesara.ifelse import ifelse
from aeppl.joint_logprob import factorized_joint_logprob
srng = at.random.RandomStream(seed=2320)
I_rv = srng.bernoulli(0.5, name="I")
X_rv = srng.normal(0, 1, name="X")
Y_rv = srng.gamma(0.5, 0.5, name="Y")
Z_rv = ifelse(I_rv, X_rv, Y_rv)
Z_rv.name = "Z"
z_vv = Z_rv.clone()
i_vv = I_rv.clone()
logp_parts = factorized_joint_logprob({Z_rv: z_vv, I_rv: i_vv})
I'm happy to work on the docstrings if you find this task tedious.
Let me know, @brandonwillard!
Please provide a minimal, self-contained, and reproducible example.
Line 92 in 18275e2
Please provide the full traceback of any errors.
AssertionError: sigma > 0
Apply node that caused the error: Assert{msg='sigma > 0'}(Elemwise{Composite{((i0 + (i1 * sqr(i2))) - log(i3))}}[(0, 2)].0, All.0)
Please provide any additional information below.
Sampling with PyMC (master) fails with assertion errors and does not treat AssertionError as a divergent sample.
Expected Behaviour
return -inf
Possible Solution
This snippet solved my issue:
import aesara
import numpy as np

aesara.assert_op.Assert = lambda name: (
    lambda res, *cond: aesara.tensor.switch(
        aesara.tensor.all(aesara.tensor.stack([c.all() for c in cond])),
        res,
        -np.inf,
    )
)
python -c "import aesara; print(aesara.config)"
Working in pymc-devs/pymc#5438, I realized that the pattern that we need to match for marginalized RVs is not too bad:
import aesara.tensor as at
from aeppl import joint_logprob
weights_rv = at.random.dirichlet([1, 1], name="weights")
components = at.stack([at.random.normal(name="c1"), at.random.normal(name="c2")], axis=-1)
mix_rv = at.sum(components * weights_rv, axis=-1)
mix_rv.name = "mix"
# weights can also just be a deterministic, only thing that matters is that it adds up to 1
weights_vv = weights_rv.clone()
mix_vv = mix_rv.clone()
joint_logprob({weights_rv: weights_vv, mix_rv: mix_vv})
import aesara.tensor as at
import aeppl
from aeppl.transforms import TransformValuesOpt, DEFAULT_TRANSFORM
p_rv = at.random.beta(1, 1, name="p")
p_vv = p_rv.clone()
tr = TransformValuesOpt({p_vv: DEFAULT_TRANSFORM})
logp_dict = aeppl.factorized_joint_logprob({p_rv: p_vv}, extra_rewrites=tr)
logp_dict
assert p_vv in logp_dict # raises
Noted by @kc611
It seems natural that we could compute the logprob graph for several independent variables (e.g, if we have more than one likelihood term, or disconnected "potential" terms).
import aesara.tensor as at
from aeppl import joint_logprob

x_rv = at.random.uniform(0, 1, name="x")
y_rv = at.random.uniform(0, 5, name="y")
z_rv = at.random.uniform(0, 10, name="z")
x = x_rv.clone()
y = y_rv.clone()
z = z_rv.clone()
joint_logprob(
[x_rv, y_rv, z_rv],
{x_rv: x, y_rv: y, z_rv: z},
)
However, I am afraid this could lead to double counting issues...
Also, right now we can stack RVs, but I guess this is limited by the shapes of the RVs:
joint_logprob(
at.stack([x_rv, y_rv, z_rv]),
{x_rv: x, y_rv: y, z_rv: z},
)
The HalfNormal in aeppl seems to accommodate the loc parameter, which means the default transform should be Interval[loc, None], assuming the loc just shifts the cutoff point of the HalfNormal away from zero.
This was pointed out during the aeppl integration of this particular test:
https://github.com/pymc-devs/pymc3/blob/main/pymc3/tests/test_transforms.py#L114
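As a plain-Aesara sketch (not aeppl's transform API) of what the shifted transform amounts to:

import aesara.tensor as at

loc = at.scalar("loc")
value = at.scalar("value")

# Interval[loc, None] is just a log transform shifted by `loc`:
forward = at.log(value - loc)     # maps (loc, inf) onto the real line
backward = at.exp(forward) + loc  # recovers `value` for value > loc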
The implementation of the Stickbreaking transform concatenates along the default axis instead of the last axis (axis=-1).
Line 277 in 9d9ffb7
Please provide a minimal, self-contained, and reproducible example.
An MWE can be as follows:
import aesara
import aesara.tensor as at
import numpy as np
value = at.matrix("value")
# Forward
log_value = at.log(value)
shift = at.sum(log_value, -1, keepdims=True) / value.shape[-1]
value = log_value[..., :-1] - shift
# Backward
value = at.concatenate([value, -at.sum(value, -1, keepdims=True)])
exp_value_max = at.exp(value - at.max(value, -1, keepdims=True))
result = exp_value_max / at.sum(exp_value_max, -1, keepdims=True)
result = aesara.function([value],[result])
result(np.array([[0.09,0.01,0.9],[0.09,0.01,0.9]])) # Not sure about the exact values
Please provide the full traceback of any errors.
TypeError: Bad input argument to aesara function with name "/tmp/ipykernel_8640/3915568923.py:1" at index 0 (0-based).
Backtrace when that variable is created:
File "/home/kc611/anaconda3/envs/pymcv4/lib/python3.7/site-packages/ipykernel/zmqshell.py", line 532, in run_cell
return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
File "/home/kc611/anaconda3/envs/pymcv4/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 2899, in run_cell
raw_cell, store_history, silent, shell_futures)
File "/home/kc611/anaconda3/envs/pymcv4/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 2944, in _run_cell
return runner(coro)
File "/home/kc611/anaconda3/envs/pymcv4/lib/python3.7/site-packages/IPython/core/async_helpers.py", line 68, in _pseudo_sync_runner
coro.send(None)
File "/home/kc611/anaconda3/envs/pymcv4/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3170, in run_cell_async
interactivity=interactivity, compiler=compiler, result=result)
File "/home/kc611/anaconda3/envs/pymcv4/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3361, in run_ast_nodes
if (await self.run_code(code, result, async_=asy)):
File "/home/kc611/anaconda3/envs/pymcv4/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3441, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "/tmp/ipykernel_8640/4089953573.py", line 9, in <module>
value = at.concatenate([value, -at.sum(value, -1, keepdims=True)])
Non-unit value on shape on a broadcastable dimension.
Please provide any additional information below.
Whereas concatenating along the last axis gives a working implementation:
import aesara
import aesara.tensor as at
import numpy as np
value = at.matrix("value")
# Forward
log_value = at.log(value)
shift = at.sum(log_value, -1, keepdims=True) / value.shape[-1]
value = log_value[..., :-1] - shift
# Backward
value = at.concatenate([value, -at.sum(value, -1, keepdims=True)], axis = -1) # Concatenating along axis = -1
exp_value_max = at.exp(value - at.max(value, -1, keepdims=True))
result = exp_value_max / at.sum(exp_value_max, -1, keepdims=True)
result = aesara.function([value],[result])
result(np.array([[0.09,0.01,0.9],[0.09,0.01,0.9]])) # Not sure about the exact values
# array([[0.23974926, 0.22131646, 0.53893428],
# [0.23974926, 0.22131646, 0.53893428]])
cc @ricardoV94 (Not sure which one of you guys pointed this out)
The logprob graph for truncated distributions is straightforward, not very different from that of censored distributions that we have already implemented.
However we need a new Op / graph to represent / generate truncated draws from an arbitrary RV.
Symbolically, we can take truncated draws from any RV by doing rejection sampling until all desired draws are in the range lower/upper. This, however, can be incredibly slow if the truncation interval corresponds to a very small mass of the original distribution.
If we have an inverse cdf function we can more easily obtain random draws by drawing from a uniform distribution in the truncation range and taking the inverse cdf of those points.
Finally, we might have specialized Ops / graphs that could be dispatched for an RV (for instance, if one has already implemented a TruncatedNormal based on the SciPy distribution).
I think we could use a hierarchical dispatch strategy to go from the most to the least specialized forms, and perhaps wrap the result in an OpFromGraph for easy logprob parsing. Here is a pseudo-code suggestion of how this could be implemented:
def truncate(rv, lower, upper):
    # Try to dispatch on a specialized graph/Op first
    try:
        truncated = _truncate(rv, lower, upper)
    except NotImplementedError:
        # Try to use the [log]cdf and icdf if they are implemented for the given RV
        try:
            icdf = _icdf(rv)
            cdf_lower = at.exp(_logcdf(rv, lower))
            cdf_upper = at.exp(_logcdf(rv, upper))
            uniform = at.random.uniform(cdf_lower, cdf_upper, size=rv.shape)
            truncated = icdf(uniform)
        except NotImplementedError:
            # Default to a slow while-loop `Scan` rejection-sampling graph
            ...
    # Wrap `truncated` in a custom OpFromGraph that can be easily parsed
    # for the logprob component?
    ...
This is just a quick sketch, happy to hear other ideas.
The entire "ignore_logprob" feature should really be removed, or, if anything, reimplemented in some non-tag
-based way. In order to do this, we need to do something about its use in mixture_replace
, though.
Originally posted by @brandonwillard in #72 (comment)
We have a PR open for an AePPL conda-forge recipe here: conda-forge/staged-recipes#15766.
Consider the following scenario in which a value variable is the output of a graph (i.e. non-"atomic"):
import aesara
import aesara.tensor as at
from aeppl import joint_logprob
Y_rv = at.random.normal(0, 1, size=(2,), name="Y")
y_vv = Y_rv.clone()
y_vv.name = "y"
y_vv_2 = 2 * y_vv
y_vv_2.name = "y_2"
logprob_non_atomic = joint_logprob({Y_rv: y_vv_2})
aesara.dprint([logprob_non_atomic, y_vv_2])
# Sum{acc_dtype=float64} [id A] ''
# |Assert{msg='sigma > 0'} [id B] 'y_2_logprob'
# |Elemwise{sub,no_inplace} [id C] ''
# | |Elemwise{sub,no_inplace} [id D] ''
# | | |Elemwise{mul,no_inplace} [id E] ''
# | | | |InplaceDimShuffle{x} [id F] ''
# | | | | |TensorConstant{-0.5} [id G]
# | | | |Elemwise{pow,no_inplace} [id H] ''
# | | | |Elemwise{true_div,no_inplace} [id I] ''
# | | | | |Elemwise{sub,no_inplace} [id J] ''
# | | | | | |Elemwise{mul,no_inplace} [id K] 'y_2'
# | | | | | | |InplaceDimShuffle{x} [id L] ''
# | | | | | | | |TensorConstant{2} [id M]
# | | | | | | |y [id N]
# | | | | | |InplaceDimShuffle{x} [id O] ''
# | | | | | |TensorConstant{0} [id P]
# | | | | |InplaceDimShuffle{x} [id Q] ''
# | | | | |TensorConstant{1} [id R]
# | | | |InplaceDimShuffle{x} [id S] ''
# | | | |TensorConstant{2} [id T]
# | | |InplaceDimShuffle{x} [id U] ''
# | | |Elemwise{log,no_inplace} [id V] ''
# | | |TensorConstant{2.5066282746310002} [id W]
# | |InplaceDimShuffle{x} [id X] ''
# | |Elemwise{log,no_inplace} [id Y] ''
# | |TensorConstant{1} [id R]
# |All [id Z] ''
# |Elemwise{gt,no_inplace} [id BA] ''
# |TensorConstant{1} [id R]
# |TensorConstant{0.0} [id BB]
# Elemwise{mul,no_inplace} [id BC] 'y_2'
# |InplaceDimShuffle{x} [id BD] ''
# | |TensorConstant{2} [id M]
# |y [id N]
The graph output demonstrates that the non-"atomic" y_vv_2 has been cloned in the joint_logprob output (i.e. the IDs of the y_vv_2 term and the equivalent sub-graph in logprob_non_atomic are not equal).
It looks like y_vv_2 is being cloned when rvs_to_value_vars is called here.
While this behavior isn't necessarily a bug, it can be confusing and sometimes complicate things for callers of joint_logprob (e.g. aesara-devs/aehmc#25 (comment)), so we should prevent this cloning whenever possible.
Right now, transforms expect RandomVariable inputs and fail otherwise. It would be useful if they were more flexible and allowed, for instance, RVs / MeasurableVariables defined via OpFromGraph.
Here is an example that currently fails with MixtureRVs:
import aesara.tensor as at
from aeppl import joint_logprob
from aeppl.transforms import TransformValuesOpt, LogOddsTransform
idx_rv = at.random.bernoulli(0.5, name='idx')
mixture1 = at.random.beta(100, 1, name='mixture_comp1')
mixture2 = at.random.beta(1, 100, name='mixture_comp2')
mixture_rv = at.stack(mixture1, mixture2)[idx_rv] # MixtureRV is artificially created via `at.stack` to trigger `OpFromGraph`
mixture_rv.name = 'mixture'
idx_vv = idx_rv.clone()
mixture_vv = mixture_rv.clone()
transform_opt = TransformValuesOpt({mixture_vv: LogOddsTransform()})
logp = joint_logprob(
{mixture_rv: mixture_vv, idx_rv: idx_vv},
extra_rewrites=transform_opt,
)
ERROR (aesara.graph.opt): Optimization failure due to: transform_values
ERROR (aesara.graph.opt): node: mixture-mixture{inline=True}(ScalarFromTensor.0, mixture_comp1, mixture_comp2, RandomGeneratorSharedVariable(<Generator(PCG64) at 0x7F3233008E40>), RandomGeneratorSharedVariable(<Generator(PCG64) at 0x7F3232C05040>), RandomGeneratorSharedVariable(<Generator(PCG64) at 0x7F3232BA4820>))
ERROR (aesara.graph.opt): TRACEBACK:
ERROR (aesara.graph.opt): Traceback (most recent call last):
File "/home/ricardo/Documents/Projects/aeppl/venv/lib/python3.8/site-packages/aesara/graph/opt.py", line 2025, in process_node
replacements = lopt.transform(fgraph, node)
File "/home/ricardo/Documents/Projects/aeppl/venv/lib/python3.8/site-packages/aesara/graph/opt.py", line 1187, in transform
return self.fn(*args, **kwargs)
File "/home/ricardo/Documents/Projects/aeppl/aeppl/transforms.py", line 140, in transform_values
new_op = _create_transformed_rv_op(node.op, transform)()
File "/home/ricardo/Documents/Projects/aeppl/venv/lib/python3.8/site-packages/aesara/tensor/random/op.py", line 124, in __init__
self.name = name or getattr(self, "name")
AttributeError: 'TransformedMixtureRV' object has no attribute 'name'
logp.eval({idx_vv: 0, mixture_vv: 10})
# array(-inf)
In contrast, this works:
idx_rv = at.random.bernoulli(0.5, name='idx')
mixture = at.random.beta([100, 1], [1, 100], name='mixture_components')
mixture_rv = mixture[idx_rv]
mixture_rv.name = 'mixture'
idx_vv = idx_rv.clone()
mixture_vv = mixture_rv.clone()
transform_opt = TransformValuesOpt({mixture_vv: LogOddsTransform()})
logp = joint_logprob(
{mixture_rv: mixture_vv, idx_rv: idx_vv},
extra_rewrites=transform_opt,
)
logp.eval({idx_vv: 0, mixture_vv: 10})
# array(-6.09256229)
There is nothing special about the transform rewrite that requires RandomVariable inputs, since transforms work on the value variables and not on the respective RVs, except for the base logprob, which by definition is available for any MeasurableVariable.
This cropped up while working in #22, even though value transforms are pretty useless there.
import aesara.tensor as at
import aeppl
x_rv = at.random.normal(name='x')
y_rv = at.random.normal(x_rv, 1, name="y")
y_vv = y_rv.clone()
logprob_dict = aeppl.factorized_joint_logprob({y_rv: y_vv})
logprob_dict
KeyError Traceback (most recent call last)
<ipython-input-6-2e07a6eebca4> in <module>
7 y_vv = y_rv.clone()
8
----> 9 logprob_dict = aeppl.factorized_joint_logprob({y_rv: y_vv})
~/.local/lib/python3.8/site-packages/aeppl/joint_logprob.py in factorized_joint_logprob(rv_values, warn_missing_rvs, extra_rewrites, **kwargs)
165 continue
166
--> 167 q_rv_value_vars = [
168 replacements[q_rv_var]
169 for q_rv_var in outputs
~/.local/lib/python3.8/site-packages/aeppl/joint_logprob.py in <listcomp>(.0)
166
167 q_rv_value_vars = [
--> 168 replacements[q_rv_var]
169 for q_rv_var in outputs
170 if not getattr(q_rv_var.tag, "ignore_logprob", False)
KeyError: x
In most of the local optimizers we return a node of a specific class (e.g. TransformedRV, MixtureRV). We should be explicit about this in the type hints, instead of using the more generic Optional[List[Node]].
This is a follow up on a discussion that started in #22 (comment)
I believe that whenever aeppl cannot provide a rewrite for a given graph, the safest way to proceed is to raise an exception. In the following snippet the returned output is not a "joint_logprob" graph at all. The slightly obscure warning that is raised right now seems insufficient. Users should not be expected to know which kinds of graphs aeppl will or will not be able to parse, nor to investigate the graph to see if some terms happen to be missing.
import aesara.tensor as at
import aeppl
x_rv = at.random.normal(name="x")
y_rv = at.cos(x_rv)
z_rv = at.random.normal(name='z') # Just so that the returned output is not `None`
y_vv = y_rv.clone()
z_vv = z_rv.clone()
aeppl.joint_logprob({y_rv: y_vv, z_rv: z_vv})
UserWarning: Found a random variable that was neither among the observations nor the conditioned variables: normal_rv{0, (0, 0), floatX, False}(RandomGeneratorSharedVariable(<Generator(PCG64) at 0x7FDFC27D6D60>), TensorConstant{[]}, TensorConstant{11}, TensorConstant{0.0}, TensorConstant{1.0})
I suggested in that PR to add a simple check at the end of factorized_joint_logprob of the form:
missing_value_terms = set(original_values.values()) - set(logprob_vars.keys())
if missing_value_terms:
raise NotImplementedError(
f"The logprob terms for the following value variables could not be derived: {missing_value_terms}"
)
SciPy's implementation might not be the best to follow, so let's revisit the implementation of this logpdf.
Originally posted by @ricardoV94 in #1 (comment)
Copying comments from this exploratory #102 PR:
This PR is a proof of concept for how we could handle composite RVs that require nested aeppl rewrites. The goal is to figure out if this approach is too limited / cumbersome and, if so, what could work better.
Roughly, most of our rewrites come in two parts: (1) they only trigger for variables that have value variables assigned to them, and (2) they replace the original graph with a measurable form to which _logprob can be safely dispatched.
In step 2, we often need inputs that are themselves measurable variables, but which don't have value variables assigned to them:
Lines 283 to 287 in 05d0c68
If these inputs are RandomVariables things are fine, as those are measurable by default. Otherwise, we might still be dealing with things that could be measurable... but we will never know because they don't (and shouldn't) have value variables and their corresponding rewrites won't be triggered due to step 1.
The illustrating example in this PR is a scalar mixture that has a clipped variable as one of its components:
import aesara.tensor as at
from aeppl import joint_logprob

x = at.clip(at.random.normal(loc=1), 0.5, 1.5); x.name = "x"
y = at.random.beta(1, 2, size=None, name="y")
comps = at.stack(x, y); comps.name = "comps"
idxs = at.random.bernoulli(0.4, size=None, name="idxs")
mix = comps[idxs]; mix.name = "mix"
mix_vv = mix.clone()
idxs_vv = idxs.clone()
logp = joint_logprob({idxs: idxs_vv, mix: mix_vv})
I have also added a test example with nested scalar mixtures.
The suggestion is that any rewrite that depends on other inputs being measurable can assign a temporary value variable UPSTREAM_VALUE to those inputs. This value variable is automatically discarded by PreserveRVMappings when such variables are converted to measurable ones by their own rewrites.
This would work for some current and future rewrites like #26, but not for everything.
For example, nested rewrites that depend on the direct manipulation / creation of value variables, such as inc_subtensor and Scans, would not work. This would also not work when the _logprob dispatched function depends on having RandomVariables as inputs, as happens with non-scalar mixtures:
Line 354 in 05d0c68
Also, depending on the order in which rewrites are called, this may not always work, because the attribution of the UPSTREAM_VALUE is not a visible graph change, and I guess the Equilibrium optimization will stop if, after passing through all nodes, none is changed.
We now have a good framework for easily implementing the HMC-related RandomVariable transforms (or, as the ML people have (re)labeled them, "normalizing flows").
We can start by creating a single Aesara rewrite (e.g. using aesara.graph.opt.local_optimizer) that replaces transformed RandomVariables with an intermediate form in the "sample-space" graph (i.e. the first input to joint_logprob). That intermediate form could be a new RandomVariable type (e.g. TransformedRV) with a _logprob dispatch that implements the change-of-variables formula (including the Jacobian). This new RandomVariable type would essentially be a wrapper around the original RandomVariable that only serves to provide a distinct _logprob dispatch.
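As a minimal illustration of the change-of-variables formula such a _logprob dispatch would implement (a hand-written sketch, not the proposed API), consider Y = exp(X) with X ~ Normal(0, 1):

import numpy as np
import aesara.tensor as at

y = at.scalar("y")
x = at.log(y)  # the inverse transform
# logp_Y(y) = logp_X(f_inv(y)) + log|d f_inv(y) / dy|
base_logp = -0.5 * x**2 - np.log(np.sqrt(2.0 * np.pi))  # standard normal logpdf
logp_y = base_logp - at.log(y)  # log-Jacobian of the inverse is -log(y)

logp_y.eval({y: 1.0})  # ~-0.9189, the lognormal(0, 1) logpdf at 1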
There was some work documented here already: pymc-devs/pymc#4396
Would be a great feature to include in the library.
CC'ing @manthehunted on the off-chance that he/she is still interested in this
Our mixture logprob graphs are tailored for univariate mixtures: they either rely on rv_pull_down rewrites that only work for univariate random variables, or assume a one-to-one mapping between the shape of the value variable and the shape of each random variable component, which is not the case for multivariate distributions.
Line 335 in 3331081
Lines 344 to 347 in 3331081
Lines 317 to 320 in 3331081
The meta information present in RandomVariable.[ndim_supp, ndims_params] and the logic in aesara.tensor.random.utils.broadcast_params should give us the tools to infer the right base log-probability shape.
This could also be aided by aesara-devs/aesara#695
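As a rough sketch of the shape inference mentioned above (assuming broadcast_params behaves as it does inside RandomVariable.perform), for a multivariate normal with ndims_params == [1, 2]:

import numpy as np
from aesara.tensor.random.utils import broadcast_params

# A (3, 2) mean has batch shape (3,); a (2, 2) covariance has batch shape ().
# Broadcasting the params against each other yields the batch shape the
# base logprob should have.
mu = np.zeros((3, 2))
cov = np.eye(2)
bcast_mu, bcast_cov = broadcast_params([mu, cov], ndims_params=[1, 2])
assert bcast_cov.shape == (3, 2, 2)  # batch shape (3,) inferred from `mu`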
Currently, Scan log-probability support only handles cases in which the MeasurableVariable is created inside the body/step function of the Scan, and not when the body/step function simply references a MeasurableVariable that is being iterated over by the Scan.
For example, the following is not supported:
import aesara
import aesara.tensor as at
from aeppl.joint_logprob import factorized_joint_logprob
srng = at.random.RandomStream(seed=2320)
N = 10
Y_rv = srng.normal(0, 1, size=N, name="Y")
def step_fn(y_t):
return y_t
Y_1T_rv, _ = aesara.scan(
fn=step_fn,
sequences=[Y_rv],
strict=True,
)
y_vv = Y_1T_rv.clone()
y_vv.name = "y"
logp_parts = factorized_joint_logprob({Y_1T_rv: y_vv})
This example is very trivial, but, if we change step_fn so that it performs a supported, measurable operation on y_t (e.g. indexing a mixture), it wouldn't work for the same reason.
When a value is assigned to Scan output terms like Y_1T_rv, we could "push" the relevant sequences inputs into the step function. In other words, we could construct the type of graph we currently handle.
Working from the example above, we would rewrite the Scan into something like the following:
# Apply a rewrite like `local_rv_size_lift` to get properly `size`-broadcasted parameters
# in a new variable `new_Y_rv`
mu_bcast, sigma_bcast = new_Y_rv.owner.inputs[3:]
def new_step_fn(mu_t, sigma_t):
return Y_rv.owner.op(mu_t, sigma_t, name="Y[t]")
new_Y_1T_rv, _ = aesara.scan(
fn=new_step_fn,
sequences=[mu_bcast, sigma_bcast],
strict=True,
)
Both #26 and #22 raise the question of how to deal with automatic RVs whose value variable results from an intermediate broadcast of the base RV:
loc = at.vector("loc")
y_base_rv = at.random.normal(0, 1, name="y_base_rv", size=2)
y_rv = loc + y_base_rv
y_val = y_rv.clone()
logp = joint_logprob(y_rv, {y_rv: y_val})
logp.eval({y_val: [0, 0, 0, 0], loc: [0, 0, 0, 0]})
The computed logp is only correct when loc is of size==2.
This might be even more complicated if the y_base_rv inputs are themselves variable (meaning it may itself not have a fixed size).
I see two ways we could deal with this; one would be to check that y_base_rv.shape == y_val.shape in the returned logprob.
The following snippet is failing locally:
import aesara
import aesara.tensor as at
from aeppl import joint_logprob
x = at.scalar('x')
beta_rv = at.random.normal(0, 1, name='beta')
y_rv = at.random.normal(beta_rv*x, 1, name='y')
beta = beta_rv.type()
y = y_rv.type()
logp = joint_logprob(y_rv, {beta_rv: beta, y_rv: y})
logp_fun = aesara.function([x, beta, y], logp)
---------------------------------------------------------------------------
UnusedInputError Traceback (most recent call last)
<ipython-input-1-ea0e0eb814fa> in <module>
11
12 logp = joint_logprob(y_rv, {beta_rv: beta, y_rv: y})
---> 13 logp_fun = aesara.function([x, beta, y], logp)
~/Documents/Projects/aeppl/venv/lib/python3.8/site-packages/aesara/compile/function/__init__.py in function(inputs, outputs, mode, updates, givens, no_default_updates, accept_inplace, name, rebuild_strict, allow_input_downcast, profile, on_unused_input)
335 # note: pfunc will also call orig_function -- orig_function is
336 # a choke point that all compilation must pass through
--> 337 fn = pfunc(
338 params=inputs,
339 outputs=outputs,
~/Documents/Projects/aeppl/venv/lib/python3.8/site-packages/aesara/compile/function/pfunc.py in pfunc(params, outputs, mode, updates, givens, no_default_updates, accept_inplace, name, rebuild_strict, allow_input_downcast, profile, on_unused_input, output_keys)
522 inputs.append(si)
523
--> 524 return orig_function(
525 inputs,
526 cloned_outputs,
~/Documents/Projects/aeppl/venv/lib/python3.8/site-packages/aesara/compile/function/types.py in orig_function(inputs, outputs, mode, accept_inplace, name, profile, on_unused_input, output_keys)
1970 try:
1971 Maker = getattr(mode, "function_maker", FunctionMaker)
-> 1972 m = Maker(
1973 inputs,
1974 outputs,
~/Documents/Projects/aeppl/venv/lib/python3.8/site-packages/aesara/compile/function/types.py in __init__(self, inputs, outputs, mode, accept_inplace, function_builder, profile, on_unused_input, fgraph, output_keys, name)
1573
1574 # Check if some input variables are unused
-> 1575 self._check_unused_inputs(inputs, outputs, on_unused_input)
1576
1577 # Make a list of (SymbolicInput|SymblicInputKits, indices,
~/Documents/Projects/aeppl/venv/lib/python3.8/site-packages/aesara/compile/function/types.py in _check_unused_inputs(self, inputs, outputs, on_unused_input)
1745 )
1746 elif on_unused_input == "raise":
-> 1747 raise UnusedInputError(msg % (inputs.index(i), i.variable, err_msg))
1748 else:
1749 raise ValueError(
UnusedInputError: aesara.function was asked to create a function computing outputs given certain inputs, but the provided input variable at index 0 is not part of the computational graph needed to compute the outputs: x.
To make this error into a warning, you can pass the parameter on_unused_input='warn' to aesara.function. To disable it completely, use on_unused_input='ignore'.
This way, it works as expected:
logp = joint_logprob(y_rv, {beta_rv: beta, y_rv: y, x:x})
It's a small suggestion, but I think showing an actual logp evaluation can make it much more intuitive what the library is about, especially for people who are not familiar with Aesara/Theano.
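For instance, something like this at the end of the introductory example (a sketch; the exact number assumes a standard normal):

import aesara.tensor as at
from aeppl import joint_logprob

x_rv = at.random.normal(0, 1, name="x")
x_vv = x_rv.clone()
logp = joint_logprob({x_rv: x_vv})

# An actual evaluation makes the point of the library concrete:
logp.eval({x_vv: 0.0})  # array(-0.91893853), the standard normal logpdf at 0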
During discussion of pymc-devs/pymc#5155, we (@ricardoV94 and I) also talked a bit about the API for aeppl.factorized_joint_logprob and how it deals with random variables that are not specified at all.
I think it might be better to change the API a bit so that all random variables in the graph have to be specified somehow. That would make it more future-proof (e.g. for implementing marginalization using a closed-form solution or some kind of numerical integration), and also make it closer to the math notation and harder for users to shoot themselves in the foot.
Let's say we have a graph like this:
import aesara
import aesara.tensor as at
import aeppl
a = at.random.normal()
b = at.random.normal(loc=a)
a_val = at.dscalar()
b_val = at.dscalar()
We could now be interested in those values (math notation):
# joint logp
P(a = a_val, b = b_val)
# some conditional logps
P(a = a_val | b = b_val) = P(a = a_val)
P(b = b_val | a = a_val)
# marginal logps
P(a = a_val) = \int_b_val P(a = a_val | b = b_val) P(b = b_val) # integral doesn't do anything...
P(b = b_val) = \int_a_val P(b = b_val | a = a_val) P(a = a_val) # this one does
# Random variables whose expectation is the marginal logp (but that might converge *really* slowly or even not at all in many cases)
P_rv(a = a_val | b = B), where B ~ b # not sure actually how to write this down properly
P_rv(b = b_val | a = A), where A ~ a
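To make the marginal concrete, here is a quick numerical check done outside aeppl with NumPy/SciPy (for this model a ~ Normal(0, 1) and b | a ~ Normal(a, 1), so marginally b ~ Normal(0, sqrt(2))):

import numpy as np
from scipy import stats

b_val = 0.5

# Closed-form marginal: b ~ Normal(0, sqrt(2))
marginal_logp = stats.norm(0, np.sqrt(2)).logpdf(b_val)

# Crude Monte Carlo estimate of \int P(b = b_val | a) P(a) da
a_draws = np.random.default_rng(0).normal(0, 1, size=100_000)
mc_logp = np.log(np.mean(stats.norm(a_draws, 1).pdf(b_val)))

np.testing.assert_allclose(marginal_logp, mc_logp, atol=1e-2)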
If we call aeppl.factorized_joint_logprob(all_vars), we get all the conditional logps: {key_var: P(key_var = key_val | all_remaining) for (key_var, key_val) in all_vars}, which seems fine to me (although I might only be interested in some of those in the first place, but I think this is fine anyway).
If we call aeppl.factorized_joint_logprob({b: b_val}) however, we actually get the random variable instead of the marginal logp, even though the call looks almost like the math notation of the marginal logp.
We could make this choice explicit by switching this use case to an API like this:
aeppl.factorized_joint_logprob({b: b_val}, sample=[a])
This could later be extended to marginalization using something like these:
aeppl.factorized_joint_logprob({b: b_val}, marginalize=[a]) # closed form solution only
aeppl.factorized_joint_logprob({b: b_val}, marginalize=[a], integration_options={a: "gauss_hermite(21)"}) # some numerical integration
I didn't have time to figure out the source of the problem, but the test suite in PyMC started to fail in multiple places with aeppl 0.0.16:
https://github.com/pymc-devs/pymc/pull/5185/checks
https://github.com/pymc-devs/pymc/pull/5159/checks
When dealing with transformed variables, we first back-transform them so that they can be safely used as inputs to other downstream variables here:
Line 145 in 18275e2
However, we later have to forward-transform them again in order to compute the Jacobian term:
Lines 86 to 87 in 18275e2
This can lead to unnecessary numerical instability in models with complex transformations. The Interval transform is one such case, and it led to a PyMC performance regression documented in pymc-devs/pymc#5088:
import aesara
import aesara.tensor as at
from aesara.compile.mode import Mode
from aeppl.transforms import IntervalTransform
a, b, x = at.scalars('a', 'b', 'x')
intvl = IntervalTransform(args_fn=lambda *args: (a, b))
graph = intvl.forward(intvl.backward(x))
graph_fn = aesara.function([a, b, x], graph, mode=Mode().excluding('fusion'))
aesara.dprint(graph_fn)
Elemwise{Sub}[(0, 0)] [id A] '' 9
|Elemwise{Log}[(0, 0)] [id B] '' 7
| |Elemwise{sub,no_inplace} [id C] '' 5
| |Elemwise{Add}[(0, 0)] [id D] '' 4
| | |Elemwise{mul,no_inplace} [id E] '' 1
| | | |Elemwise{sigmoid,no_inplace} [id F] '' 0
| | | | |x [id G]
| | | |b [id H]
| | |Elemwise{Mul}[(0, 0)] [id I] '' 3
| | |Elemwise{Sub}[(0, 1)] [id J] '' 2
| | | |TensorConstant{1.0} [id K]
| | | |Elemwise{sigmoid,no_inplace} [id F] '' 0
| | |a [id L]
| |a [id L]
|Elemwise{Log}[(0, 0)] [id M] '' 8
|Elemwise{Sub}[(0, 1)] [id N] '' 6
|b [id H]
|Elemwise{Add}[(0, 0)] [id D] '' 4
The graph is far from optimized and I don't see an elegant way to rewrite it away. This is probably true for the Simplex transform as well.
One alternative would be to make the original value variable an additional input of the TransformedRV, enabling us to use it directly when computing the Jacobian term. This would also be future-proof, in that we wouldn't need to add specialized rewrites for new transforms that may be added in aeppl or in PyMC.
We need to add support for mixtures defined using the Elemwise + Switch Op.
The following is a simple example that isn't supported:
import aesara.tensor as at
from aeppl.joint_logprob import factorized_joint_logprob
srng = at.random.RandomStream(seed=2320)
I_rv = srng.bernoulli(0.5, size=10, name="I")
X_rv = srng.normal(0, 1, name="X")
Y_rv = srng.gamma(0.5, 0.5, size=10, name="Y")
Z_rv = at.switch(I_rv, X_rv, Y_rv)
Z_rv.name = "Z"
z_vv = Z_rv.clone()
i_vv = I_rv.clone()
logp_parts = factorized_joint_logprob({Z_rv: z_vv, I_rv: i_vv})
This is leading to failing tests in pymc-devs/pymc#5170
from aeppl.transforms import Simplex
import aesara.tensor as at
x = at.fmatrix('x')
assert x.dtype == "float32"
forw_x = Simplex().forward(x)
assert forw_x.dtype == "float32" # Raises
It happens because of this division by shape which is by default int64:
Line 268 in 808236f
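A possible fix (a sketch, not the actual patch) would be to cast the integer shape to the value's dtype before dividing:

import aesara.tensor as at

x = at.fmatrix("x")
log_x = at.log(x)
# `x.shape[-1]` is int64; casting it first keeps the division in float32
shift = at.sum(log_x, -1, keepdims=True) / x.shape[-1].astype(x.dtype)
assert shift.dtype == "float32"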
Potential terms are arbitrary terms that are added to the logprob graph without "values".
Would something like this make sense?
import numpy as np
import aesara.tensor as at
from aeppl import joint_logprob

x_rv = at.random.normal()
potential = at.switch(x_rv < 0, -np.inf, 0)
x = x_rv.clone()
joint_logprob([x_rv, potential], {x_rv: x, potential: None})
We can add support for Scan Ops so that joint_logprob would cover nearly arbitrary time series.
A lot of the relevant Scan rewrite code was written for symbolic-pymc in pymc-devs/symbolic-pymc#113 and pymc-devs/symbolic-pymc#114, so we can take some of the code/ideas from there and adapt them to aeppl.
See the Gitter conversation about this idea here. (NB: If you use https://develop.element.io/ you can also turn on LaTeX rendering in the "Labs" settings.)
Also, see these tests for explicit examples of log-likelihood generation for a simple DLM and HMM in symbolic-pymc.
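As an example of the kind of time-series graph this would cover, here is a Gaussian random walk built with Scan (a sketch of the target use case, not of the rewrite itself):

import aesara
import aesara.tensor as at

srng = at.random.RandomStream(seed=2320)

def step_fn(y_tm1):
    # each step is a normal centered on the previous value
    return srng.normal(y_tm1, 1.0)

Y_1T_rv, _ = aesara.scan(
    fn=step_fn,
    outputs_info=[at.as_tensor(0.0)],
    n_steps=10,
)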
With a new local_optimizer addition to RVSinker, we could add support for affine/linear transforms of RandomVariables.
In other words, the following could be made to work:
import aesara.tensor as at
from aeppl import joint_logprob
a, b = at.scalars("ab")
Y_rv = a * at.random.normal(0, 1, name="Y") + b
y_val = Y_rv.type()
logp = joint_logprob(Y_rv, {Y_rv: y_val})
I don't understand how to generate a graph with multiple automatic transforms:
import aesara
import aesara.tensor as at
from aesara.graph.opt import in2out, out2in
from aeppl import joint_logprob
from aeppl.transforms import transform_logprob
# Does not matter if I set ignore_newtrees to True, or if I use out2in
transform_opt = in2out(transform_logprob, ignore_newtrees=False)
l_rv = at.random.uniform(0, 1, name='l_rv')
u_rv = at.random.uniform(2, 4, name='u_rv')
# l_rv = at.random.normal(name='l_rv')
# u_rv = at.random.normal(name='u_rv')
y_rv = at.random.uniform(l_rv, u_rv, name='y_rv')
l = l_rv.type()
l.name = 'l'
u = u_rv.type()
u.name = 'u'
y = y_rv.type()
y.name='y'
logp = joint_logprob(y_rv, {y_rv:y, l_rv:l, u_rv:u}, extra_rewrites=transform_opt)
logp.eval({l: -1, u:1, y:0})
AttributeError: 'numpy.random._generator.Generator' object has no attribute 'uniform_interval'
I changed this line so that the name would not be a tuple:
Line 232 in b179a6e
cls_dict["name"] = f"{rv_name}_{trans_name}"
But otherwise the error comes from the same source, i.e., trying to access the rng_fn of the uniform_interval RV.
At the moment, the default RVTransform.log_jac_det method isn't taking the absolute value of the Jacobian determinant. See here.
That line should be changed to return at.log(at.abs(at.nlinalg.det(at.atleast_2d(jacobian(phi_inv, [value]))))).
See #82 (comment).
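A quick numeric illustration of why the absolute value matters (plain NumPy, just to show the failure mode): for the decreasing transform y = -x the Jacobian determinant is -1, so omitting the absolute value produces nan:

import numpy as np

with np.errstate(invalid="ignore"):
    assert np.isnan(np.log(-1.0))   # log(det(J)) without the abs
assert np.log(np.abs(-1.0)) == 0.0  # log|det(J)| is the correct term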
Please provide a minimal, self-contained, and reproducible example.
import aesara.tensor as at
import aeppl
cov = at.eye(2)
mu_rv = at.random.normal(size=2, name='mu')
mu_broadcasted = at.broadcast_arrays(mu_rv, cov[..., -1])[0]
# mu_broadcasted = mu_rv # No error if this line is uncommented
x_rv = at.random.multivariate_normal(mu_broadcasted, cov, name='x')
mu_vv = mu_rv.clone()
x_vv = x_rv.clone()
logp = aeppl.factorized_joint_logprob({mu_rv: mu_vv, x_rv: x_vv})
aeppl/joint_logprob.py:161: UserWarning: Found a random variable that was neither among the observations nor the conditioned variables: normal_rv{0, (0, 0), floatX, False}(RandomGeneratorSharedVariable(<Generator(PCG64) at 0x7F51B7F3EC80>), TensorConstant{[]}, TensorConstant{11}, TensorConstant{(2,) of 0.0}, TensorConstant{(2,) of 1.0})
warnings.warn(
logp[x_vv].eval({mu_vv: [0, 0], x_vv: [0, 0]})
UnusedInputError Traceback (most recent call last)
/tmp/ipykernel_55923/2490478877.py in <module>
----> 1 logp[x_vv].eval({mu_vv: [0, 0], x_vv: [0, 0]})
~/miniconda3/envs/aeppl/lib/python3.9/site-packages/aesara/graph/basic.py in eval(self, inputs_to_values)
548 inputs = tuple(sorted(inputs_to_values.keys(), key=id))
549 if inputs not in self._fn_cache:
--> 550 self._fn_cache[inputs] = function(inputs, self)
551 args = [inputs_to_values[param] for param in inputs]
552
~/miniconda3/envs/aeppl/lib/python3.9/site-packages/aesara/compile/function/__init__.py in function(inputs, outputs, mode, updates, givens, no_default_updates, accept_inplace, name, rebuild_strict, allow_input_downcast, profile, on_unused_input)
335 # note: pfunc will also call orig_function -- orig_function is
336 # a choke point that all compilation must pass through
--> 337 fn = pfunc(
338 params=inputs,
339 outputs=outputs,
~/miniconda3/envs/aeppl/lib/python3.9/site-packages/aesara/compile/function/pfunc.py in pfunc(params, outputs, mode, updates, givens, no_default_updates, accept_inplace, name, rebuild_strict, allow_input_downcast, profile, on_unused_input, output_keys)
522 inputs.append(si)
523
--> 524 return orig_function(
525 inputs,
526 cloned_outputs,
~/miniconda3/envs/aeppl/lib/python3.9/site-packages/aesara/compile/function/types.py in orig_function(inputs, outputs, mode, accept_inplace, name, profile, on_unused_input, output_keys)
1971 try:
1972 Maker = getattr(mode, "function_maker", FunctionMaker)
-> 1973 m = Maker(
1974 inputs,
1975 outputs,
~/miniconda3/envs/aeppl/lib/python3.9/site-packages/aesara/compile/function/types.py in __init__(self, inputs, outputs, mode, accept_inplace, function_builder, profile, on_unused_input, fgraph, output_keys, name)
1574
1575 # Check if some input variables are unused
-> 1576 self._check_unused_inputs(inputs, outputs, on_unused_input)
1577
1578 # Make a list of (SymbolicInput|SymblicInputKits, indices,
~/miniconda3/envs/aeppl/lib/python3.9/site-packages/aesara/compile/function/types.py in _check_unused_inputs(self, inputs, outputs, on_unused_input)
1746 )
1747 elif on_unused_input == "raise":
-> 1748 raise UnusedInputError(msg % (inputs.index(i), i.variable, err_msg))
1749 else:
1750 raise ValueError(
UnusedInputError: aesara.function was asked to create a function computing outputs given certain inputs, but the provided input variable at index 0 is not part of the computational graph needed to compute the outputs: mu.
To make this error into a warning, you can pass the parameter on_unused_input='warn' to aesara.function. To disable it completely, use on_unused_input='ignore'.
First seen in pymc-devs/pymc#5382 (comment)
Please provide a minimal, self-contained, and reproducible example.
import aeppl
import aesara.tensor as at
import aesara
with aesara.config.change_flags(floatX="float32", warn_float64="raise"):
aeppl.logprob(at.random.normal(), at.constant(1.))
Please provide the full traceback of any errors.
/home/ferres/.miniconda3/envs/pymc/lib/python3.9/site-packages/aeppl/logprob.py(75)normal_logprob()
-> - at.log(np.sqrt(2.0 * np.pi))
(Pdb) l
70 def normal_logprob(op, values, *inputs, **kwargs):
71 (value,) = values
72 mu, sigma = inputs[3:]
73 res = (
74 -0.5 * at.pow((value - mu) / sigma, 2)
75 -> - at.log(np.sqrt(2.0 * np.pi))
76 - at.log(sigma)
77 )
78 res = Assert("sigma > 0")(res, at.all(at.gt(sigma, 0.0)))
79 return res
Please provide any additional information below.
python -c "import aesara; print(aesara.config)"
Seems more consistent to use a Feature dictionary to track this type of graph information vs. adding it to the tag of the nodes.
Is there any downside?
We could try to wrap some method from scipy.integrate and use it to convolve continuous variables in a joint graph of z = x + y where neither variable {x, y} is being conditioned on.
I don't know how feasible this would be, since SciPy requires a callable. Can we use something like OpFromGraph?
Obviously, we can (and should) implement the optimized graphs for variables that have a simple analytical form, such as the convolution of two Gaussians (which I think symbolic-pymc could already do).
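Here is a rough numerical sketch (SciPy only, outside Aesara) of the convolution integral for z = x + y with two standard normals, whose closed form is Normal(0, sqrt(2)):

import numpy as np
from scipy import integrate, stats

def z_pdf(z):
    # p_Z(z) = \int p_X(x) p_Y(z - x) dx
    integrand = lambda x: stats.norm(0, 1).pdf(x) * stats.norm(0, 1).pdf(z - x)
    return integrate.quad(integrand, -np.inf, np.inf)[0]

np.testing.assert_allclose(z_pdf(0.3), stats.norm(0, np.sqrt(2)).pdf(0.3), rtol=1e-6)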
The Aesara pretty-printers do not indent their results. We could really use some good column-aware indentation to make the print output readable.
For example, the IPython pretty printing library has a convenient means of making parts of a string "breakable" (e.g. I used it here).
Just noticed it when creating a new environment for the repo.
Unnecessary warnings appear to be emitted by test_joint_logprob_subtensor. In at least one instance, it looks like they're triggered by RandomStateType objects.
The current transforms rewrite looks for all RVs that have a standard TransformedRV and automatically converts those (and the respective value variables). To plug aeppl into PyMC3 in the future, we probably need something more flexible, where we specify which transforms we want to apply to which value variables.
This should also allow us to apply transforms to value variables that are associated with dynamically derived RVs.
For example:
import aesara.tensor as at
from aeppl import joint_logprob
shift_rv = at.random.normal(0, 1)
y_rv = at.random.exponential(1) + shift_rv
shift = shift_rv.clone()
y = y_rv.clone()
logp = joint_logprob(y_rv, {y_rv: y, shift_rv: shift}, transforms={y: Interval(shift_rv, None)})
I think the logic could still be implemented as a custom "rewrite", I just used a new keyword to simplify the example.
We probably need to collapse each factor to a scalar to avoid double counting issues like this:
import aesara.tensor.random as atr
from aeppl import joint_logprob
#%%
x_rv = atr.normal(name="x")
y_rv = atr.normal(x_rv, 1, size=2, name="y")
x = x_rv.clone()
y = y_rv.clone()
logprob = joint_logprob([y_rv], {x_rv: x, y_rv: y})
eval1 = logprob.eval({x: 0, y: [0, 0]}).sum()
eval1 # -3.6757541328186907
#%%
x_rv = atr.normal(name="x")
y_rv1 = atr.normal(x_rv, 1, name="y1")
y_rv2 = atr.normal(x_rv, 1, name="y2")
x = x_rv.clone()
y1 = y_rv1.clone()
y2 = y_rv2.clone()
logprob = joint_logprob([y_rv1, y_rv2], {x_rv: x, y_rv1: y1, y_rv2: y2})
eval2 = logprob.eval({x: 0, y1: 0, y2: 0})
eval2 # array(-2.7568156)
This is also more easily dealt with if we work with something like #47, where we can decide how to combine terms after we collect them.
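A minimal sketch of the kind of collapsing that could avoid this, starting from the per-variable factors returned by factorized_joint_logprob:

import aesara.tensor.random as atr
from aeppl import factorized_joint_logprob

x_rv = atr.normal(name="x")
y_rv = atr.normal(x_rv, 1, size=2, name="y")
x = x_rv.clone()
y = y_rv.clone()

# Reduce each factor to a scalar before combining, so that one factor
# cannot be broadcast against another and counted multiple times
logps = factorized_joint_logprob({x_rv: x, y_rv: y})
logprob = sum(logp.sum() for logp in logps.values())

logprob.eval({x: 0, y: [0, 0]})  # array(-2.75681559), matching the second example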
Currently, Scan does not have a pprint implementation.
I wonder whether it would be possible to rewrite the logp graphs to marginalize over finite discrete variables, indicated by the user (not necessarily all that are in the graph).
x_rv = at.random.bernoulli(0.7)
y_rv = at.random.normal([0, 1], [1, 1])[x_rv]
y = y_rv.type()
logp = joint_logprob(y_rv, {y_rv: y}, marginalize={x_rv})
Its logp would be p(y_rv=y | x_rv=0) * p(x_rv=0) + p(y_rv=y | x_rv=1) * p(x_rv=1).
This is straightforward(ish) if the marginalization happens just above the requested variable (e.g., y_rv), but gets more complicated if it happens at the top of a deeper graph.
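A quick numerical version of that formula for the example above (plain NumPy/SciPy; P(x_rv=0) = 0.3 selects the first component):

import numpy as np
from scipy import stats

y = 0.3
p_y = 0.3 * stats.norm(0, 1).pdf(y) + 0.7 * stats.norm(1, 1).pdf(y)
marginal_logp = np.log(p_y)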
The idea would be to derive the logprob for the following type of graphs
y_rv = at.max(at.random.uniform(0, 1, size=3)) # or min
y_rv = at.sort(at.random.uniform(0, 1, size=3))
y_rv = at.sort(at.random.uniform(0, 1, size=3))[idx] # max / min correspond to idx==-1 or idx==0
https://en.wikipedia.org/wiki/Order_statistic#Probabilistic_analysis
This might be a bit far-fetched, and it may be difficult to find a good general solution that goes beyond a few simple cases (e.g., order statistics with non-i.i.d. RVs):
y_rv = at.max(at.random.uniform([0, 1], [2, 3]))
y_rv = at.max(at.stack([at.random.uniform(0, 1), at.random.normal(0, 1)]))
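For the simplest case above there is a well-known closed form we could target first: for Y = max(U_1, ..., U_n) with U_i ~ Uniform(0, 1) i.i.d., F_Y(y) = y**n, so the logpdf is log(n) + (n - 1) * log(y). A quick check in NumPy:

import numpy as np

n, y = 3, 0.8
logp = np.log(n) + (n - 1) * np.log(y)
# pdf = n * y**(n - 1) = 3 * 0.64 = 1.92
np.testing.assert_allclose(np.exp(logp), 1.92)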
Since joint_logprob will be a likely point of entry for most people, we should make its signature more explicit and find a way to reuse parts of the docstring from factorized_joint_logprob.
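One low-tech way to do the docstring reuse (a sketch; the wrapper body here is hypothetical):

from aeppl import factorized_joint_logprob

def joint_logprob(rv_values, **kwargs):
    """Compute the sum of the factorized joint log-probability."""
    logps = factorized_joint_logprob(rv_values, **kwargs)
    ...

# Append the shared parameter documentation at import time
joint_logprob.__doc__ += "\n\n" + (factorized_joint_logprob.__doc__ or "")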
import numpy as np
import aesara
import aesara.tensor as at
import aeppl
from aeppl.transforms import TransformValuesOpt, LogOddsTransform
init = at.random.beta(1, 1)
innov, _ = aesara.scan(
fn=lambda prev_innov: at.random.beta(prev_innov * 10, (1 - prev_innov) * 10),
outputs_info=[init],
n_steps=10,
)
init_vv = init.clone()
innov_vv = innov.clone()
tr = TransformValuesOpt({
init_vv: LogOddsTransform(),
innov_vv: LogOddsTransform(),
})
logp = aeppl.joint_logprob({init: init_vv, innov: innov_vv}, extra_rewrites=tr)
# init_vv is properly transformed
logp.eval({init_vv: -0.5, innov_vv: np.full(10, 0.5)})
# array(7.22123237)
# innov_vv is not, eval raises ParameterValueError
logp.eval({init_vv: -0.5, innov_vv: np.full(10, -0.5)})
---------------------------------------------------------------------------
ParameterValueError Traceback (most recent call last)
scan_perform.pyx in aesara.scan.scan_perform.perform()
ParameterValueError: 0 <= value <= 1, alpha > 0, beta > 0