probsys / sppl Goto Github PK
View Code? Open in Web Editor NEWProbabilistic programming system for fast and exact symbolic inference
License: Apache License 2.0
Probabilistic programming system for fast and exact symbolic inference
License: Apache License 2.0
On the conditional distribution of a multivariate Normal given a transformation – the linear case
https://www.sciencedirect.com/science/article/pii/S240584401831449X
Random sampling from a truncated multivariate normal distribution
https://www.sciencedirect.com/science/article/pii/0893965994900426
Blocked by #13
This way the child nodes do not have to keep solving the same event over and over
Currently it converts them to integers.
Convert an SPN
Python object into an SPML
program.
Currently a constant c
must be numeric, by creating a polynomial (0*X + c)
. In general, we can have constants in symbolic (nominal) domains, for Piecewise expressions such as
Y = 'negative' * (X < 0) + 'positive' * (X > 0) + 'zero' * (X << {0})
Related #19
When expressing a Multi-Mixture program (synthesized e.g., using the CrossCat prior), we have:
model = (Start
# View 0 has 3 variables and 2 clusters.
& Z[0] >> NominalDist({'0': .8, '1': .2})
& Cond (
Z[0] << {'0'},
X[0] >> Norm(loc=0, scale=1)
& X[1] >> Norm(loc=0, scale=1)
& X[2] >> Norm(loc=0, scale=1),
Z[0] << {'1'},
X[0] >> Norm(loc=10, scale=1)
& X[1] >> Norm(loc=10, scale=1)
& X[2] >> Norm(loc=10, scale=1))
# View 1 has 2 variables and 3 clusters.
& Z[1] >> NominalDist({'0': .2, '1': .8, '2': 0})
& Cond (
Z[1] << {'0'},
X[3] >> Norm(loc=0, scale=1)
& X[4] >> Norm(loc=0, scale=1),
Z[1] << {'1'},
X[3] >> Norm(loc=10, scale=1)
& X[4] >> Norm(loc=10, scale=1),
))
The root should be a product not a sum.
Currently it converts them to integers.
The current interpreter is written for an imperative modeling language, where the SPN is part of the "background" environment that is incrementally built with the execution of each statement.
An alternative approach is to write a functional modeling language, where each expression yields an SPN. (Will make use of let
).
i.e., add more cases to this method
https://github.com/probcomp/sum-product-dsl/blob/d2745c965d19e14ff7ed6368da37e03c48b0e0e9/src/transforms.py#L289-L311
One possibility is to have a generic Constant
transform, in which case we can do something like:
Y = ('high')*( X << {'a', 'b'}) + ('low')*(X << {'c'})
To avoid situations like the following, where iterating over the support requires a confusing type conversion...
A xor B = (A and ~B) or (~A and B)
Surprising that Python does not automatically implement this...
Can significantly reduce system complexity I think.
Faulty reasoning in disjoint-union algorithm for probabilities.
Key idea: Pr[A or B] // where A and B are in DNF
= Pr[A or (B and ~A)] // disjoint union property
= Pr[A] + Pr[B and ~A] // since events are disjoint
Since A and B are in DNF, write them as
A = ϕ(A; X) and ϕ(A; Y)
B = ϕ(B; X) and ϕ(B; Y)
To assess B and ~A, we have:
[ϕ(B; X) and ϕ(B; Y)] and ~[ϕ(A; X) and ϕ(B; Y)]
= [ϕ(B; X) and ϕ(B; Y)] and [~ϕ(A; X) or ~ϕ(B; Y)]
= [ϕ(B; X) and ϕ(B; Y) and ~ϕ(A; X)]
or [ϕ(B; X) and ϕ(B; Y) and ~ϕ(B; Y)]
= [ϕ(B; X) and ~ϕ(A; X) and ϕ(B; Y)]
or [ϕ(B; X) and ϕ(B; Y) and ~ϕ(B; Y)]
the issue is that clauses in this DNF expression not necessarily
disjoint.
Example:
ϕ(A; X) = X > 0
ϕ(A; Y) = Y < 1
ϕ(B; X) = X < 1
ϕ(B; Y) = Y < 3
Then the final expression is:
[(X < 1) and ~(X > 0) and (Y < 3)]
or [(X < 1) and (Y < 3) and ~(Y < 1)]
= [(X < 1) and (X ≤ 0) and (Y < 3)]
or [(X < 1) and (Y < 3) and (Y ≥ 1)]
= [ (X ≤ 0) and (Y < 3) ] or [ (X < 1) and (1 ≤ Y < 3) ]
which is not disjoint, and reduces to inclusion-exclusion.
i.e., (X > 0) & ((X < 0) | Y > 0))
-> ((X>0) & (X < 0)) | ((X > 0) & (Y > 0))
-> ((X > 0) & (Y > 0))
Reintroduce after a clean semantics is established for mixed-type distributions.
Specifically, apply conjunctions = set([...])
An event is a Transform with a binary domain.
Currently it converts them to integers.
Should return a dictionary of solutions, one for each symbol.
The model is as follows:
Using the updated SPN DSL parser, we express it as:
X = Identity('X')
model = 0.5 * ( # American student
0.99 * Uniform(X, loc=0, scale=4) | \
0.01 * Atomic(X, loc=4)) | \
0.5 * ( # Indian student
0.99 * Uniform(X, loc=0, scale=10) | \
0.01 * Atomic(X, loc=10))
Example of present behavior
print((1.5 <= X) & ((1 <= X) < 2))
(1.5 < X) & (1 <= X < 2)
If the subexpressions match, we can simplify it to 1.5 < X < 2
There is not much of a simplification that we can do for Or.
Using reduce with &
and |
will correctly handle the simplification
Saves computing all the subsets terms involving those conjunctions
https://github.com/probcomp/sum-product-dsl/blob/aa2a5024c624cada38fb96559600ba40f4d3e0e9/src/spn.py#L295-L299
Everything having to do with "invert" should use a variable name "y", not "x", for consistency. Intervals, intersections, values, and finite sets should also all be "y".
Really these should be called ContinuousReal and DiscreteReal, not Numerical and Ordinal
To match test_transforms_solve
Duplicate work factoring the query is done at each Product node in the network!
It should suffice to specify the SPN using Python if-else notation and leveraging the ast module (extended docs)
For example, we can define the Indian GPA problem as:
nationality = choose({'India': 0.5, 'USA': 0.5})
score = choose({'imperfect': 0.99, 'perfect': 0.01})
if (nationality == 'India'):
if (score == 'perfect'):
gpa = Atomic(10)
if (score == 'imperfect'):
gpa = Uniform(0, 10)
if (nationality == 'USA'):
if (score == 'perfect'):
gpa = Atomic(4)
if (score == 'imperfect'):
gpa = Uniform(0, 4)
In general, any SPN will have this specific structure.
Partition of domain specified in terms of general (solvable) events.
Symbol of all subexprs must be identical.
Symbol of events must match symbol of subexprs.
Solutions of events may not overlap.
Syntax for a piecewise function (related: #6, so that events are transforms):
Y = (exp(X)) * (X << {0, 1)) + X**2 * (0 < X < 1)
The interesting typing judgments are around addition and multiplication for events and piecewise.
>>> (X << {'a'}).solve()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/scratch/fsaad/sum-product-dsl/build/lib/spn/transforms.py", line 711, in solve
return self.invert({1})
File "/scratch/fsaad/sum-product-dsl/build/lib/spn/transforms.py", line 53, in invert
return self.invert_finite(intersection)
File "/scratch/fsaad/sum-product-dsl/build/lib/spn/transforms.py", line 764, in invert_finite
return self.subexpr.invert(values_prime)
File "/scratch/fsaad/sum-product-dsl/build/lib/spn/transforms.py", line 59, in invert
assert False, 'Unknown intersection: %s' % (intersection,)
AssertionError: Unknown intersection: Intersection({a}, Union({-oo, oo}, Reals))
>>> (~(X << {'a'})).solve()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/scratch/fsaad/sum-product-dsl/build/lib/spn/transforms.py", line 711, in solve
return self.invert({1})
File "/scratch/fsaad/sum-product-dsl/build/lib/spn/transforms.py", line 53, in invert
return self.invert_finite(intersection)
File "/scratch/fsaad/sum-product-dsl/build/lib/spn/transforms.py", line 764, in invert_finite
return self.subexpr.invert(values_prime)
File "/scratch/fsaad/sum-product-dsl/build/lib/spn/transforms.py", line 59, in invert
assert False, 'Unknown intersection: %s' % (intersection,)
AssertionError: Unknown intersection: Reals \ {a}
The solution is to create two classes EventFiniteReal
and EventFiniteNominal
and handle the inversion appropriately under each case.
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.