improbable-research / keanu


A probabilistic approach from an Improbabilistic company

License: MIT License

Java 86.48% Kotlin 1.09% FreeMarker 0.06% Python 12.02% Shell 0.27% Makefile 0.02% Batchfile 0.02% Dockerfile 0.05%


keanu's Issues

[Question] Are SonarQube fixes welcome?

Hello,

I'd like to contribute fixes for some SonarQube rule violations. Is that of interest to the project? If so, should I just open pull requests?

Thanks!

Custom proposal distributions for metropolis hastings

Is your feature request related to a problem? Please describe.
My Metropolis-Hastings samples are mixing poorly, and I'd like to explore the effect of using a proposal distribution with lower variance than my priors for some vertices.

Describe the solution you'd like
The option to control the choice of proposal distribution separately from the prior when calling

sampling.sample(net=net,
sample_from=net.get_latent_vertices(),
algo='metropolis',
draws=2000)

Describe alternatives you've considered
Using NUTS so that the step size is adaptive. This is currently not helping, so it would be great to have more control in order to investigate the behaviour of the samplers.

Additional context
Implementing a hierarchical linear model similar to this http://www.sanjogmisra.com/blog/stan/
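
To make the request concrete, here is a minimal sketch in plain Python (not Keanu's API) of a random-walk Metropolis-Hastings sampler whose proposal width is chosen by the caller, independently of any prior. The names `metropolis_hastings`, `proposal_sigma` and `standard_normal_logp` are hypothetical and exist only for this illustration.

```python
import math
import random


def metropolis_hastings(log_target, start, proposal_sigma, draws, seed=0):
    """Random-walk Metropolis-Hastings with a Gaussian proposal.

    proposal_sigma is chosen by the caller, independently of any prior.
    Returns the list of samples and the fraction of accepted proposals.
    """
    rng = random.Random(seed)
    current = start
    current_logp = log_target(current)
    samples = []
    accepted = 0
    for _ in range(draws):
        # The Gaussian proposal is symmetric, so the Hastings correction cancels.
        candidate = current + rng.gauss(0.0, proposal_sigma)
        candidate_logp = log_target(candidate)
        accept_probability = math.exp(min(0.0, candidate_logp - current_logp))
        if rng.random() < accept_probability:
            current, current_logp = candidate, candidate_logp
            accepted += 1
        samples.append(current)
    return samples, accepted / draws


def standard_normal_logp(x):
    # Unnormalised log density of N(0, 1), used here as a toy target.
    return -0.5 * x * x


narrow_samples, narrow_rate = metropolis_hastings(standard_normal_logp, 0.0, 0.1, 2000)
wide_samples, wide_rate = metropolis_hastings(standard_normal_logp, 0.0, 50.0, 2000)
```

A narrow proposal is accepted more often but moves in small steps, while a very wide one is rejected most of the time, which is why being able to tune it separately from the prior helps when diagnosing mixing.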

copying/cloning vertices?

I have found myself wanting to create copies of a vertex e.g. if I want to use the same prior distribution for different vertices. Is there a simple way to do this in Java or would vertices have to implement their own .clone() or .copy() method to create a deep copy?

Py4JException when calling Keanu methods in python

Describe the bug
I can import keanu-python, but Py4J throws an error whenever I call a keanu method.
Stacktrace:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/johannespetrat/anaconda2/envs/py37/lib/python3.7/site-packages/keanu/vertex/generated.py", line 162, in ConstantDouble
    return Double(context.jvm_view().ConstantDoubleVertex, constant)
  File "/Users/johannespetrat/anaconda2/envs/py37/lib/python3.7/site-packages/keanu/vertex/base.py", line 21, in __init__
    val = ctor(*(Vertex.__parse_args(args)))
  File "/Users/johannespetrat/anaconda2/envs/py37/lib/python3.7/site-packages/py4j/java_gateway.py", line 1554, in __call__
    answer, self._gateway_client, None, self._fqn)
  File "/Users/johannespetrat/anaconda2/envs/py37/lib/python3.7/site-packages/py4j/protocol.py", line 332, in get_return_value
    format(target_id, ".", name, value))
py4j.protocol.Py4JError: An error occurred while calling None.io.improbable.keanu.vertices.dbl.nonprobabilistic.ConstantDoubleVertex. Trace:
py4j.Py4JException: Constructor io.improbable.keanu.vertices.dbl.nonprobabilistic.ConstantDoubleVertex([class io.improbable.keanu.vertices.dbl.nonprobabilistic.ConstantDoubleVertex]) does not exist
	at py4j.reflection.ReflectionEngine.getConstructor(ReflectionEngine.java:179)
	at py4j.reflection.ReflectionEngine.getConstructor(ReflectionEngine.java:196)
	at py4j.Gateway.invoke(Gateway.java:237)
	at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
	at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
	at py4j.GatewayConnection.run(GatewayConnection.java:238)
	at java.lang.Thread.run(Thread.java:748)

To Reproduce
Steps to reproduce the behaviour:

Create a conda environment with python 3.7.0
Run pip install keanu
Open the python shell
Run import keanu
Run keanu.vertex.ConstantDouble(1.)

Expected behavior
This should create a ConstantDoubleVertex.

Desktop (please complete the following information):

OS: MacOS High Sierra
Python 3.7.0
Anaconda 4.4.10

HalfCauchy vertex (positive support) gets a negative sampled value after NUTS

Describe the bug
While implementing the 8 schools model I noticed that a couple of things behave unexpectedly:

The MC gets negative values for TAU, which should only be able to take positive values
Although the MC seems to converge to a negative value (-1.9 in my case) the most likely value assigned to the node is quite different (-5.6 in my case)

To Reproduce
Steps to reproduce the behavior:

Run this model which I adapted from the starter boilerplate (it's the famous non-centered 8 schools - reference: https://docs.pymc.io/notebooks/Diagnosing_biased_Inference_with_Divergences.html):


import io.improbable.keanu.algorithms.NetworkSamples;
import io.improbable.keanu.algorithms.mcmc.NUTS;
import io.improbable.keanu.network.BayesianNetwork;
import io.improbable.keanu.vertices.dbl.DoubleVertex;
import io.improbable.keanu.vertices.dbl.KeanuRandom;
import io.improbable.keanu.vertices.dbl.nonprobabilistic.ConstantDoubleVertex;
import io.improbable.keanu.vertices.dbl.probabilistic.*;

import java.util.*;

public class Eight {

    public static void main(String[] args) {
        Eight model = new Eight();
        model.run();
    }

    private KeanuRandom random;
    public double results;

    Eight() {

    }

    void run() {
        random = new KeanuRandom(42);

        int J = 8;
        double[] yObs = new double[]{28., 8., -3., 7., -1., 1., 18., 12.};
        double[] sigma = new double[]{15., 10., 16., 11., 9., 11., 10., 18.};

        ConstantDoubleVertex sigma_vertex = new ConstantDoubleVertex(sigma);

        DoubleVertex mu = new GaussianVertex(0, 5);
        DoubleVertex tau = new HalfCauchyVertex(5);
        DoubleVertex theta_tilde = new GaussianVertex(new long[]{1, J}, 0, 1);
        DoubleVertex theta = mu.plus(tau.multiply(theta_tilde));
        DoubleVertex y = new GaussianVertex(theta, sigma_vertex);
        y.observe(yObs);

        BayesianNetwork bayesNet = new BayesianNetwork(
                y.getConnectedGraph()
        );

        NUTS sampler = NUTS.builder()
                .adaptCount(100)
                .random(random)
                .build();

        NetworkSamples samples = sampler.getPosteriorSamples(
                bayesNet,
                bayesNet.getLatentVertices(),
                1000
        );

        List tauPosterior = samples.get(tau).asList();

        for (int i = 0; i < tauPosterior.size(); i++) {
            System.out.println(tauPosterior.get(i));
        }

        results = tau.getValue().scalar();

        System.out.println("Most probable value for tau (should be around 2.7): " + results);
    }

}

Expected behavior
Convergence of the tau parameter around a positive value (range: 2.5-3)

Desktop (please complete the following information):

OS: Mac OSX, java 11

Allow MetropolisHastings samples to be dropped as they are generated

Is your feature request related to a problem? Please describe.
I am sampling each layer of a tree consecutively and then downsampling to one sample per complete iteration of the tree. The issue I am facing is that, to take a decent number of samples, I first need to store every sample for each vertex before downsampling. This uses a lot of memory, and once I hit about 80GB, memory pressure rises and the system becomes unusable.

Describe the solution you'd like
I would like to be able to downsample as we go, e.g. take around 140 samples (one sample for each tree layer), then drop all but the last one and repeat until the desired sample count is reached. One option would be to add downsampling as an optional parameter to the getPosteriorSamples method.

Describe alternatives you've considered
Other possible solutions would be to allow users to create their own drop selector and pass it in, or to use streams to filter in real time (although this might be more complex).
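
The streaming alternative can be sketched in plain Python with a generator that discards samples as they arrive, so only the kept ones are ever stored. This is an illustration only: `thin`, `keep_every` and `fake_sampler` are hypothetical names, and the stand-in sampler just counts upwards.

```python
from itertools import islice


def thin(sample_stream, keep_every):
    """Yield only every keep_every-th item, discarding the rest as they arrive."""
    if keep_every < 1:
        raise ValueError("keep_every must be at least 1")
    for index, sample in enumerate(sample_stream, start=1):
        if index % keep_every == 0:
            yield sample


def fake_sampler():
    # Stand-in for an MCMC sampler that produces samples one at a time.
    n = 0
    while True:
        n += 1
        yield n


# Keep the last of every 140 samples until 5 samples have been collected.
kept = list(islice(thin(fake_sampler(), 140), 5))
```

Memory use is then proportional to the number of kept samples rather than to the total number generated.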

Geometric distribution/vertex

It would be good to have a geometric vertex, as in https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.geom.html
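
For reference, a minimal sketch of the two pieces such a vertex would need, a log-probability and a sampler, written in plain Python. It assumes the scipy.stats.geom convention (support k = 1, 2, ..., with P(X = k) = (1 - p)^(k - 1) p); the function names are hypothetical.

```python
import math
import random


def geometric_log_pmf(k, p):
    """log P(X = k) for k = 1, 2, ... with success probability p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    if k < 1:
        return float("-inf")
    if p == 1.0:
        return 0.0 if k == 1 else float("-inf")
    return (k - 1) * math.log1p(-p) + math.log(p)


def geometric_sample(p, rng):
    """Draw one value by inverse transform sampling."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    if p == 1.0:
        return 1
    u = 1.0 - rng.random()  # lies in (0, 1], which avoids log(0)
    return max(1, math.ceil(math.log(u) / math.log1p(-p)))


draws = [geometric_sample(0.3, random.Random(seed)) for seed in range(100)]
```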

SliceVertex value has different size than vertex

The SliceVertex size matches the size of the input vertex but with the sliced dimension size 1. The value of the SliceVertex drops that dimension instead of setting it equal to 1.

import io.improbable.keanu.tensor.dbl.DoubleTensor
import io.improbable.keanu.vertices.ConstantVertex

fun main(args: Array<String>) {
    
    val a = ConstantVertex.of(DoubleTensor.eye(2))
    println(a.shape.toList())
    // [2, 2]

    val b = a.slice(0, 0)
    println(b.shape.toList())
    println(b.calculate().shape.toList())
    // [1, 2]; [2]

    val c = a.slice(1, 0)
    println(c.shape.toList())
    println(c.calculate().shape.toList())
    // [2, 1]; [2]

}

The tensor returned by SliceVertex.calculate should have the size equal to SliceVertex.initialShape.
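
The expected relationship between the vertex shape and the value shape can be written down as a small plain-Python sketch. It only models shapes and a 2-D nested list, not Keanu tensors, and `slice_shape` and `slice_matrix` are hypothetical helper names.

```python
def slice_shape(shape, dimension):
    """Shape of a slice that keeps the sliced dimension with length 1."""
    if not 0 <= dimension < len(shape):
        raise ValueError("dimension out of range")
    return tuple(shape[:dimension]) + (1,) + tuple(shape[dimension + 1:])


def slice_matrix(rows, dimension, index):
    """Slice a 2-D nested list along a dimension while keeping rank 2."""
    if dimension == 0:
        return [list(rows[index])]
    if dimension == 1:
        return [[row[index]] for row in rows]
    raise ValueError("dimension must be 0 or 1 for a matrix")


eye = [[1.0, 0.0], [0.0, 1.0]]
row_slice = slice_matrix(eye, 0, 0)     # one row, two columns
column_slice = slice_matrix(eye, 1, 0)  # two rows, one column
```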

Also, I found some strange behavior when writing this minimal example to produce the bug.

import io.improbable.keanu.tensor.dbl.DoubleTensor
import io.improbable.keanu.vertices.ConstantVertex


fun main(args: Array<String>) {


    val slice = ConstantVertex.of(DoubleTensor.eye(2))
            .slice(1, 0)

    // Raises exception
    try {
        println(slice.plus(ConstantVertex.of(DoubleTensor.ones(2))))
    } catch (err: Exception) {
        println(err.message)
    }

    // Raises a different exception
    try {
        val d = ConstantVertex.of(DoubleTensor.ones(2, 1))
        println(slice.plus(d).calculate())
    } catch (err: Exception) {
        println(err.message)
    }

    // No longer raises exception (but is the same code as the first try-catch block)
    try {
        println(slice.plus(ConstantVertex.of(DoubleTensor.ones(2))))
    } catch (err: Exception) {
        println(err.message)
    }

}

Build broken on windows:

When building on Windows (e.g. a new clone of master, followed by gradle build) I get a build failure with errors such as:

:keanu-project:javadocC:\Users\chris.major\Documents\keanu\keanu-project\src\main\java\io\improbable\keanu\algorithms\mcmc\Hamiltonian.java:140: error: unmappable character for encoding Cp1252

Even when viewed in github it looks like the file encoding isn't correct.

`MinimumVertex` and `MaximumVertex`

In one of my applications of Keanu I have found that a vertex calculating the minimum of two input vertices would be really helpful. This could also be used to create upper or lower bounds of vertices.

A MinimumVertex would likely just use if or lessThanOrEqual under the hood, but having this as a designated vertex would provide a nice layer of abstraction.
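
As a rough illustration of the "comparison plus if" idea, here is a plain-Python sketch over flat lists. It is not Keanu code, and `elementwise_min`, `elementwise_max` and `clamp` are hypothetical names.

```python
def elementwise_min(left, right):
    """Pick the smaller of each pair using a lessThanOrEqual-style comparison."""
    if len(left) != len(right):
        raise ValueError("inputs must have the same length")
    return [l if l <= r else r for l, r in zip(left, right)]


def elementwise_max(left, right):
    """Pick the larger of each pair."""
    if len(left) != len(right):
        raise ValueError("inputs must have the same length")
    return [l if l >= r else r for l, r in zip(left, right)]


def clamp(values, lower, upper):
    """Bound every value to [lower, upper] by combining min and max."""
    capped = elementwise_min(values, [upper] * len(values))
    return elementwise_max(capped, [lower] * len(values))


bounded = clamp([-3.0, 0.5, 7.0], 0.0, 1.0)
```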

Implement logging with a specific name rather than using the root logger

Logging calls in base.py call the module functions directly, which log messages under the name "root." It would be better if base.py and snipper_writer.py used a named logger like in context.py.

keanu/keanu-python/keanu/base.py

Line 26 in d2ca830

    
           logging.warning("\"{}\" is not implemented so Java API \"{}\" was called directly instead".format(k, java_name))

keanu/keanu-python/keanu/context.py

Line 30 in 11d95ee

    
           logging.getLogger("keanu").debug("Initiating Py4J gateway with classpath %s" % classpath)
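
A minimal sketch of the suggested change, assuming the logger name "keanu" used in context.py is the one wanted everywhere. The helper `warn_not_implemented` is a hypothetical wrapper around the existing warning message.

```python
import logging

# Module-level named logger, shared by every module that asks for "keanu".
logger = logging.getLogger("keanu")


def warn_not_implemented(k, java_name):
    # Logged under the name "keanu" instead of "root", so callers can
    # filter or silence Keanu's messages without touching other libraries.
    logger.warning(
        '"%s" is not implemented so Java API "%s" was called directly instead',
        k,
        java_name,
    )
```

With this in place, users can adjust verbosity with `logging.getLogger("keanu").setLevel(...)` without affecting the root logger.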

Observing a BoolVertex flattens it into a row vector (shape is lost)

See this line:

keanu/keanu-project/src/main/java/io/improbable/keanu/vertices/bool/BoolVertex.java

Line 81 in f02a8ec

super.observe(BooleanTensor.create(values));

For example:

DoubleTensor probs = DoubleTensor.create(new double[] {0.5, 0.5}, new int[] {2, 1});
BoolVertex outcomes = new BernoulliVertex(new ConstantDoubleVertex(probs));
System.out.println(String.format("%s %s", outcomes.getShape()[0], outcomes.getShape()[1]));
// 2 1
outcomes.observe(new boolean[] {true, false});
System.out.println(String.format("%s %s", outcomes.getShape()[0], outcomes.getShape()[1]));
// 1 2

Not sure if this is limited to boolean vertices or general.
More generally, I think it'd make sense to be able to observe a tensor (of the same shape as the vertex) rather than a flat array.

ExponentialVertex Arguments

Hi,

When I use an exponential vertex I would normally expect to have to input a single parameter (lambda). However, your ExponentialVertex constructor takes two arguments. The issue is that these arguments are named 'a' and 'b', so it is not clear what they relate to. Clear naming and docs would solve the issue nicely.

Thanks,
Mike

Improve error message when trying to use variational methods on a non-continuous network

Is your feature request related to a problem? Please describe.
Obviously, trying to use GradientOptimizer on a graph containing Poisson vertices is silly, but it would be good to have a reliable error message as a safety net.

OrMultipleVertex doesn't accept standard Kotlin list

fun bug_b() {
    val cards = ArrayList<BooleanVertex>()
    for (i in 0 until 10) {
        cards.add(BernoulliVertex(0.5))
    }
    
    // compiler error
    OrMultipleVertex(cards)
    // works
    OrMultipleVertex(cards as Collection<BooleanVertex>)
}

`.transpose()` and `.reshape()` for vertices?

I'm sure these are part of the plan, dropping this here as a reminder :)

Easier and future-proof way to tell if a vertex is constant.

While analysing a network of Vertices it's not obvious which have constant values.
For example I've got the following code:

    private static boolean isConstant(DoubleVertex vertex) {
        if ( vertex instanceof ConstantDoubleVertex ){
            return true;
        } else if ( vertex instanceof DoubleBinaryOpVertex ){
            DoubleBinaryOpVertex dbv = (DoubleBinaryOpVertex)vertex;
            return isConstant(dbv.getLeft()) && isConstant(dbv.getRight());
        }
        return false;
    }

I think this would be useful as a core part of keanu so it's maintained

Alternatives
The main issue with this would be "custom" vertices from other sources. It may be better to provide an interface that lets code understand that a vertex isn't going to change.

Additional context
Eventually this might even allow some automatic optimisations (e.g. cache values if they can never change).
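
The interface idea can be sketched in plain Python: each node reports what kind it is, and constancy follows from its inputs. This is an illustration under stated assumptions, not Keanu's design; `Node`, its `kind` values and `is_constant` are all hypothetical.

```python
class Node:
    """Tiny stand-in for a vertex: a kind plus the nodes it depends on."""

    def __init__(self, kind, inputs=()):
        if kind not in ("constant", "probabilistic", "operation"):
            raise ValueError("unknown kind: " + kind)
        self.kind = kind
        self.inputs = tuple(inputs)


def is_constant(node):
    """A node is constant if it is a constant leaf, or a deterministic
    operation whose inputs are all constant."""
    if node.kind == "constant":
        return True
    if node.kind == "probabilistic":
        return False
    return bool(node.inputs) and all(is_constant(parent) for parent in node.inputs)


two = Node("constant")
three = Node("constant")
noise = Node("probabilistic", [two, three])
fixed_sum = Node("operation", [two, three])
noisy_sum = Node("operation", [fixed_sum, noise])
```

Because the check only relies on what each node reports about itself, custom vertices from other sources would work as long as they declare their kind.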

Create easy way to take x samples

Is your feature request related to a problem? Please describe.
Currently if I need to take 100 samples of a random variable I do this:

speedVertex.setValue(new double[100]);
speedVertex.setValue(speedVertex.sample().average());

But this is two lines for a rather simple thing.

Describe the solution you'd like
Some kind of single line method like
speedVertex.sample(100).average()

Trouble running agent-based model using UnaryOpLambda

I am trying to integrate keanu with an agent-based model that uses the MASON toolkit, but having trouble with building the UnaryOpLambda to run the model and the arguments it needs.

The current state of my model (state) can be represented by a vector of all the agents' x and y locations and their max speed ([x0,y0,s0,x1,y1,s1,...,]). We represent this as a ConstantDoubleVertex[].

The UnaryOpLambda should take the current state and call a predict function, which itself takes a state and returns a new state after iterating the model. predict is defined like:

public static ConstantDoubleVertex[] predict(ConstantDoubleVertex[] state) {

  // take 'state' and run the model forward x iterations
  
  return state
}

Here is the problem:


UnaryOpLambda<ConstantDoubleVertex[], ConstantDoubleVertex[]> box =
  new UnaryOpLambda<ConstantDoubleVertex[], ConstantDoubleVertex[]>(
    new long[]{state.length}, state, StationTransition::predict);

Which returns the error:


Error:(153, 113) java: incompatible types: io.improbable.keanu.vertices.dbl.nonprobabilistic.ConstantDoubleVertex[] cannot be converted to io.improbable.keanu.vertices.Vertex<io.improbable.keanu.vertices.dbl.nonprobabilistic.ConstantDoubleVertex[]>

The problem as I understand it is that UnaryOpLambda accepts only a single vertex as its argument. How can I pass an array as the input to UnaryOpLambda? Or is there a better way of doing this?

Searching vertices by ID in network

Is your feature request related to a problem? Please describe.
When debugging graphs with >100 vertices it can get a bit tricky to inspect individual vertices (e.g. by sampling). Sometimes I found myself auto-generating IDs for vertices ('input_vertex_1', 'input_vertex_2', 'parameter_1', etc.) and it would be super useful to just retrieve all vertices with an ID using a wildcard search.

Describe the solution you'd like
A method in the BayesianNetwork class that returns a list of vertices that match a search string.

List<Vertex> search(String searchString)
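
A sketch of how the wildcard matching could behave, using shell-style patterns from Python's standard library. It operates on plain label strings rather than on a BayesianNetwork, and the `search` function here is a hypothetical stand-in for the proposed method.

```python
from fnmatch import fnmatchcase


def search(labels, pattern):
    """Return the labels matching a shell-style wildcard pattern, in order."""
    return [label for label in labels if fnmatchcase(label, pattern)]


labels = ["input_vertex_1", "input_vertex_2", "parameter_1"]
inputs = search(labels, "input_vertex_*")
firsts = search(labels, "*_1")
```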

`setWithMaskInPlace` creates garbage values when mask is smaller than tensor

Describe the bug
Calling setWithMaskInPlace for Integer and Double tensors with a mask not the same length as the tensor produces unexpected behaviours

To Reproduce
Steps to reproduce the behavior:

call setWithMaskInPlace with a mask with a length different than the tensor
if the mask is smaller, then it will create garbage values for the elements it doesn't cover
if the mask is larger, then it will broadcast

Expected behavior
This type of input should not be prohibited.
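
One strict option, sketched in plain Python over flat lists, is to reject a mask whose length differs from the tensor's instead of silently producing garbage. Whether Keanu should reject or broadcast is left open here; `set_with_mask` is a hypothetical helper, not the library method.

```python
def set_with_mask(values, mask, new_value):
    """Return a copy of values with new_value wherever the mask is truthy.

    Raises ValueError if the mask does not cover the values exactly.
    """
    if len(mask) != len(values):
        raise ValueError(
            "mask length %d does not match tensor length %d" % (len(mask), len(values))
        )
    return [new_value if m else v for v, m in zip(values, mask)]


updated = set_with_mask([1.0, 2.0, 3.0], [1, 0, 1], -1.0)
```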

Cannot use 0.0.25 as dependency because of a typo in slf4j-api transitive dependency version

When I try to add io.improbable:keanu:0.0.25 dependency to my project, I cannot build it because the 'org.slf4j:slf4j-api:1. 8.0-beta2' transitive dependency cannot be resolved.

The problem seems to be in the typo (redundant space) here:
https://github.com/improbable-research/keanu/blob/develop/keanu-project/build.gradle#L35

I tried with sbt first and then with Maven.
With Maven getting the following error during the build:

[ERROR] Failed to execute goal on project cgk: Could not resolve dependencies for project net.shcherbakovs:cgk:jar:0.0.1: Failed to collect dependencies at io.improbable:keanu:jar:0.0.25 -> org.slf4j:slf4j-api:jar:1. 8.0-beta2: Failed to read artifact descriptor for org.slf4j:slf4j-api:jar:1. 8.0-beta2: Could not transfer artifact org.slf4j:slf4j-api:pom:1. 8.0-beta2 from/to ...

DirichletVertex as input to CategoricalVertex

In order to learn the probabilities associated with categories in a CategoricalVertex it would be really useful to supply a Dirichlet distribution as a prior. This is used in Hidden Markov Models for example. ($x_{t=1..t}$ in https://en.wikipedia.org/wiki/Hidden_Markov_model#Examples)

The constructor could look something like this:

double concentration = 0.5;
DoubleVertex dirichlet = new DirichletVertex(new int[]{10}, concentration);
CategoricalVertex categorical = new CategoricalVertex<Integer>(dirichlet);

which would create a categorical distribution over 10 categories.

UniformVertex upper bound is not inclusive

Describe the bug
The UniformVertex allows you to specify upper and lower bounds but the upper bound is exclusive and doing a logProb() at this value will give -Infinity.

To Reproduce

UniformVertex vertex = new UniformVertex(0.0, 1.0);
vertex.setValue(1.0);
vertex.logProbAtValue();

Expected behavior
Upper bound should be inclusive
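
The difference between the two conventions is small enough to show in a plain-Python sketch. `uniform_log_prob` and its `upper_inclusive` flag are hypothetical and only illustrate the requested behaviour next to the reported one.

```python
import math


def uniform_log_prob(value, lower, upper, upper_inclusive=True):
    """Log density of a continuous uniform distribution on [lower, upper]."""
    if lower >= upper:
        raise ValueError("lower must be less than upper")
    too_low = value < lower
    too_high = value > upper if upper_inclusive else value >= upper
    if too_low or too_high:
        return float("-inf")
    return -math.log(upper - lower)


requested = uniform_log_prob(1.0, 0.0, 1.0)                        # finite
reported = uniform_log_prob(1.0, 0.0, 1.0, upper_inclusive=False)  # -infinity
```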

DoubleTensor.linspace creates float values, not double

The ND4J tensor within the DoubleTensor returned by DoubleTensor.linspace has contents of type FloatBuffer, not DoubleBuffer.

DoubleTensor.linspace(0.0, 1.0, 101)

Converting VertexSamples to DoubleTensor

At the moment it is quite hard to work with samples coming from MH and NUTS because they are in the NetworkSamples (or VertexSamples) format. VertexSamples has an asList() method. Can we please also have an asTensor() method?

`set_and_cascade` not working?

When running set_and_cascade() on the inputs of a Gaussian, its values don't get updated. Perhaps I am misunderstanding what set_and_cascade() does, so I added two examples below to show what's going on.

This works:

import keanu as kn
a = kn.vertex.Gaussian(0., 1.)
b = kn.vertex.Gaussian(0., 1.)
c = a + b
a.set_and_cascade(10.)
c.get_value() == a.get_value() + b.get_value()

but this doesn't:

import keanu as kn
a = kn.vertex.Gaussian(0., 1.)
b = kn.vertex.Gaussian(0., 1.)
c = kn.vertex.Gaussian(a + b, 1.)
a.set_and_cascade(10.)
c.get_value() == a.get_value() + b.get_value()

I discovered this because MetropolisHastings didn't seem to update the values of latent vertices in my graph. And btw, I'm using v0.0.20.

Visualise just conditional dependencies in bayes net

When I viz the full computational graph it's difficult to parse. If I want to check the conditional dependencies I've specified without caring about the structured form these take (i.e. the bayes net as often drawn out with subject matter experts) I can't do this easily.

I'd like the ability to visualise the Bayes net showing only the relationships between random variables, without deterministic operations such as additions.

Extended function on DoubleTensor

I'm trying to use DoubleTensor and feel it needs these functions:

equalsWithEps
This would allow the comparison between two DoubleTensors with a specified epsilon.
zero
This would set all elements in a DoubleTensor to 0. This may be a special case of a more general setAll function.
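
A plain-Python sketch of the semantics being asked for, using flat lists in place of tensors. The names mirror the request (`equals_with_eps`, `zero`, `set_all`) but the signatures are assumptions.

```python
def equals_with_eps(left, right, eps):
    """True if both have the same length and every pair differs by at most eps."""
    if len(left) != len(right):
        return False
    return all(abs(l - r) <= eps for l, r in zip(left, right))


def set_all(values, fill):
    """Overwrite every element in place with fill and return the same list."""
    for i in range(len(values)):
        values[i] = fill
    return values


def zero(values):
    """Special case of set_all that fills with 0.0."""
    return set_all(values, 0.0)


data = [1.0, 2.0, 3.0]
zero(data)
```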

ScalarDoubleTensor implementation of matrix Inverse is broken

Describe the bug
The current implementation of 'inverse()' in the ScalarDoubleTensor object is broken in that it just returns a duplicate of the current Tensor instead of inverting.
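
For a scalar treated as a 1x1 matrix, the matrix inverse reduces to the reciprocal. A minimal plain-Python sketch of that expectation follows; `scalar_matrix_inverse` is a hypothetical name.

```python
def scalar_matrix_inverse(value):
    """Inverse of the 1x1 matrix [value], which is [1 / value]."""
    if value == 0.0:
        raise ZeroDivisionError("a 1x1 matrix containing 0 is singular")
    return 1.0 / value


inverse_of_four = scalar_matrix_inverse(4.0)
```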

Calling .averages() on a sample that contains a gamma vertex with a tensor crashes

Describe the bug

 samples.getDoubleTensorSamples(edge.getTotalTimeVertex()).getAverages().lessThan(50).asFlatList().stream().filter(Boolean::booleanValue).count() / (double) TENSOR_SIZE)

Causes the following error when I use a Gamma vertex in my project:

Exception in thread "main" java.lang.NullPointerException
	at org.nd4j.linalg.cpu.nativecpu.ops.NativeOpExecutioner.exec(NativeOpExecutioner.java:848)
	at io.improbable.keanu.tensor.INDArrayShim.execBroadcast(INDArrayShim.java:89)
	at io.improbable.keanu.tensor.INDArrayShim.broadcastPlus(INDArrayShim.java:67)
	at io.improbable.keanu.tensor.INDArrayShim.addi(INDArrayShim.java:61)
	at io.improbable.keanu.tensor.dbl.Nd4jDoubleTensor.plusInPlace(Nd4jDoubleTensor.java:558)
	at io.improbable.keanu.tensor.dbl.Nd4jDoubleTensor.plusInPlace(Nd4jDoubleTensor.java:35)
	at java.base/java.util.stream.ReduceOps$1ReducingSink.accept(ReduceOps.java:80)
	at java.base/java.util.Collections$2.tryAdvance(Collections.java:4727)
	at java.base/java.util.Collections$2.forEachRemaining(Collections.java:4735)
	at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
	at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
	at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913)
	at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.base/java.util.stream.ReferencePipeline.reduce(ReferencePipeline.java:553)
	at io.improbable.keanu.vertices.dbl.DoubleVertexSamples.getAverages(DoubleVertexSamples.java:22)
	at io.improbable.irsim.IRSim.lambda$buildModel$4(IRSim.java:134)
	at io.improbable.irsim.GraphVisualisation.lambda$new$0(GraphVisualisation.java:73)
	at java.base/java.util.HashMap$KeySpliterator.forEachRemaining(HashMap.java:1608)
	at java.base/java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:658)
	at io.improbable.irsim.GraphVisualisation.<init>(GraphVisualisation.java:61)
	at io.improbable.irsim.IRSim.buildModel(IRSim.java:130)
	at io.improbable.irsim.RoadBinaryParser.complete(RoadBinaryParser.java:150)
	at org.openstreetmap.osmosis.osmbinary.file.BlockInputStream.process(BlockInputStream.java:37)
	at io.improbable.irsim.IRSim.main(IRSim.java:79)

To Reproduce
I think this could be reproduced by using a Gamma vertex with a tensor and then sampling it and calling .getAverages()

Expected behavior
It should not throw a NullPointerException.

Add ability to calculate probability on tensors

Is your feature request related to a problem? Please describe.
I have a tensor containing 20 values and I want to know what proportion of them are less than X.

Describe the solution you'd like
I would like to be able to call .probability(lambda) on the tensor, as you can with DoubleTensorSamples at the moment.
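
In plain Python the requested operation amounts to the fraction of elements that satisfy a predicate. The sketch below uses a flat list in place of a tensor, and the free function `probability` is a hypothetical stand-in for the proposed method.

```python
def probability(values, predicate):
    """Fraction of elements for which predicate returns True."""
    if not values:
        raise ValueError("cannot compute a probability over no values")
    return sum(1 for v in values if predicate(v)) / len(values)


values = [1.0, 2.0, 3.0, 4.0]
fraction_below_three = probability(values, lambda v: v < 3.0)
```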

Nd4j greaterThanOrEqual(IntegerTensor value) behaves like #greaterThan when value is a scalar

Describe the bug
Nd4jInteger#greaterThanOrEqual(IntegerTensor) and Nd4jDouble#greaterThanOrEqual(DoubleTensor) behave like greaterThan when you pass Nd4j{Integer, Double}Tensor.scalar(int) as an argument.

To Reproduce
Steps to reproduce the behavior:

Create two Nd4j tensors, where the second one has only 1 dimension. Make sure there are values that are equal between the two tensors.
Call greaterThanOrEqual on the first tensor and pass the scalar tensor as an argument

Expected behavior
The expected result is a boolean tensor with true for equal values
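
The expected result can be pinned down with a plain-Python sketch in which a length-1 argument is broadcast against the first operand. Flat lists stand in for tensors, and `greater_than_or_equal` is a hypothetical helper rather than the Nd4j-backed method.

```python
def greater_than_or_equal(values, other):
    """Elementwise values >= other, broadcasting a length-1 other."""
    if len(other) == 1:
        other = list(other) * len(values)
    if len(other) != len(values):
        raise ValueError("shapes are not compatible")
    return [v >= o for v, o in zip(values, other)]


result = greater_than_or_equal([1, 2, 3], [2])
```

The element equal to the scalar (2) must map to true; a result of false there is the greaterThan behaviour described in the report.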

Broken links in readme

Python doc and future links are broken (404)

Why Java?

Hi guys,
I've just a random question - why Java? I understand why you'd like to use the JVM - but why not Clojure or Scala?

NoClassDefFoundError when observing in python Keanu

Describe the bug
There seems to be an error with nd4j that pops up whenever I want to observe a variable. Created a simple program below that observes a Gaussian.

Exception in thread "Thread-3" java.lang.ExceptionInInitializerError
	at org.nd4j.linalg.cpu.nativecpu.ops.NativeOpExecutioner.<init>(NativeOpExecutioner.java:71)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at java.lang.Class.newInstance(Class.java:442)
	at org.nd4j.linalg.factory.Nd4j.initWithBackend(Nd4j.java:5587)
	at org.nd4j.linalg.factory.Nd4j.initContext(Nd4j.java:5482)
	at org.nd4j.linalg.factory.Nd4j.<clinit>(Nd4j.java:215)
	at io.improbable.keanu.tensor.INDArrayShim.lambda$startNewThreadForNd4j$0(INDArrayShim.java:52)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: ND4J is probably missing dependencies. For more information, please refer to: http://nd4j.org/getstarted.html
	at org.nd4j.nativeblas.NativeOpsHolder.<init>(NativeOpsHolder.java:68)
	at org.nd4j.nativeblas.NativeOpsHolder.<clinit>(NativeOpsHolder.java:36)
	... 11 more
Caused by: java.lang.UnsatisfiedLinkError: no jnind4jcpu in java.library.path
	at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1867)
	at java.lang.Runtime.loadLibrary0(Runtime.java:870)
	at java.lang.System.loadLibrary(System.java:1122)
	at org.bytedeco.javacpp.Loader.loadLibrary(Loader.java:1258)
	at org.bytedeco.javacpp.Loader.load(Loader.java:999)
	at org.bytedeco.javacpp.Loader.load(Loader.java:891)
	at org.nd4j.nativeblas.Nd4jCpu.<clinit>(Nd4jCpu.java:10)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:348)
	at org.bytedeco.javacpp.Loader.load(Loader.java:950)
	at org.bytedeco.javacpp.Loader.load(Loader.java:891)
	at org.nd4j.nativeblas.Nd4jCpu$NativeOps.<clinit>(Nd4jCpu.java:1613)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:264)
	at org.nd4j.nativeblas.NativeOpsHolder.<init>(NativeOpsHolder.java:46)
	... 12 more
Caused by: java.lang.UnsatisfiedLinkError: /Users/johannespetrat/.javacpp/cache/nd4j-native-1.0.0-beta3-macosx-x86_64.jar/org/nd4j/nativeblas/macosx-x86_64/libjnind4jcpu.dylib: dlopen(/Users/johannespetrat/.javacpp/cache/nd4j-native-1.0.0-beta3-macosx-x86_64.jar/org/nd4j/nativeblas/macosx-x86_64/libjnind4jcpu.dylib, 1): Library not loaded: @rpath/libmkldnn.0.dylib
  Referenced from: /Users/johannespetrat/.javacpp/cache/nd4j-native-1.0.0-beta3-macosx-x86_64.jar/org/nd4j/nativeblas/macosx-x86_64/./libnd4jcpu.dylib
  Reason: image not found
	at java.lang.ClassLoader$NativeLibrary.load(Native Method)
	at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1941)
	at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1824)
	at java.lang.Runtime.load0(Runtime.java:809)
	at java.lang.System.load(System.java:1086)
	at org.bytedeco.javacpp.Loader.loadLibrary(Loader.java:1238)
	... 23 more
Traceback (most recent call last):
  File "python/black-scholes/sandbox.py", line 15, in <module>
    output.observe(realMu + np.random.randn(1000))
  File "/Users/johannespetrat/.local/share/virtualenvs/applied-keanu-RuPOvGbC/lib/python3.6/site-packages/keanu/vertex/base.py", line 42, in observe
    self.unwrap().observe(Tensor(self.cast(v)).unwrap())
  File "/Users/johannespetrat/.local/share/virtualenvs/applied-keanu-RuPOvGbC/lib/python3.6/site-packages/keanu/tensor.py", line 23, in __init__
    super(Tensor, self).__init__(Tensor.__get_tensor_from_ndarray(t))
  File "/Users/johannespetrat/.local/share/virtualenvs/applied-keanu-RuPOvGbC/lib/python3.6/site-packages/keanu/tensor.py", line 49, in __get_tensor_from_ndarray
    return ctor(values, shape)
  File "/Users/johannespetrat/.local/share/virtualenvs/applied-keanu-RuPOvGbC/lib/python3.6/site-packages/py4j/java_gateway.py", line 1286, in __call__
    answer, self.gateway_client, self.target_id, self.name)
  File "/Users/johannespetrat/.local/share/virtualenvs/applied-keanu-RuPOvGbC/lib/python3.6/site-packages/py4j/protocol.py", line 328, in get_return_value
    format(target_id, ".", name), value)
py4j.protocol.Py4JJavaError: An error occurred while calling z:io.improbable.keanu.tensor.dbl.DoubleTensor.create.
: java.lang.NoClassDefFoundError: Could not initialize class org.nd4j.linalg.factory.Nd4j
	at io.improbable.keanu.tensor.TypedINDArrayFactory.create(TypedINDArrayFactory.java:12)
	at io.improbable.keanu.tensor.dbl.Nd4jDoubleTensor.<init>(Nd4jDoubleTensor.java:58)
	at io.improbable.keanu.tensor.dbl.Nd4jDoubleTensor.create(Nd4jDoubleTensor.java:70)
	at io.improbable.keanu.tensor.dbl.DoubleTensor.create(DoubleTensor.java:40)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
	at py4j.Gateway.invoke(Gateway.java:282)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.GatewayConnection.run(GatewayConnection.java:238)
	at java.lang.Thread.run(Thread.java:748)

To Reproduce

import keanu as kn
import numpy as np

if __name__ == '__main__':
    mu = kn.vertex.Gaussian(0.0, 0.1)
    output = kn.vertex.Gaussian(mu, 0.1)

    mu.set_label("mu")
    output.set_label("output")

    realMu = 1.
    output.observe(realMu + np.random.randn(1000))
    net = kn.BayesNet(output.get_connected_graph())

Expected behavior
The numpy array should be converted to a keanu tensor.

Desktop (please complete the following information):

OS: macOS High Sierra (10.13.3)
Keanu Version 0.0.18
Python version: 3.6.4
Using pipenv

getVertices on BayesNet

Trying to get the vertices within a BayesNet to calculate metrics on the graph. There currently isn't an exposed method on BayesNet to get all of the vertices. Can the private getVertices method be made public?

Adding vertex for transition matrices

For models with a time-step (such as Hidden Markov Models) it is often necessary to implement a transition matrix. If we had three states A, B and C and p_AB is the probability of transitioning from A to B then we can describe our transition matrix as
$\Bigl(\begin{matrix} p_{AA} & p_{AB}&p_{AC} \\ p_{BA} & p_{BB}&p_{BC} \\ p_{CA} & p_{CB}&p_{CC} \end{matrix}\Bigr) \\ \\ \text{where } p_{AA} + p_{AB} + p_{AC} = 1$

This would be really similar to the current implementation of the DirichletVertex. In the DirichletVertex all probabilities sum to 1, but for a transition matrix we only want the rows to sum to 1.
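
The row-wise constraint can be illustrated in plain Python: each row is normalised on its own, so every row sums to 1 while the matrix as a whole does not. `normalise_rows` and `is_row_stochastic` are hypothetical helpers for this sketch.

```python
def normalise_rows(weights):
    """Scale each row of non-negative weights so that it sums to 1."""
    rows = []
    for row in weights:
        total = sum(row)
        if total <= 0 or any(w < 0 for w in row):
            raise ValueError("each row needs non-negative weights with a positive sum")
        rows.append([w / total for w in row])
    return rows


def is_row_stochastic(matrix, tol=1e-9):
    """True if all entries are non-negative and every row sums to 1."""
    return all(
        all(p >= 0 for p in row) and abs(sum(row) - 1.0) <= tol
        for row in matrix
    )


# Rows correspond to the current state (A, B, C), columns to the next state.
transition = normalise_rows([[1, 1, 2], [3, 0, 0], [1, 1, 1]])
```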

Include Shape in SaveLoad interface

Describe the bug
I have a vertex whose shape I define manually. This allows me to do a correct matrix multiplication when I define my Bayesian Network. Unfortunately, shape information is not stored in the protobuf saved version so when I save and load my graph, I get an error because the matrix multiplication doesn't like the shape of my vertex.

Expected behavior
I would like to be able to save and load Bayesian Networks which include vertices that are reliant on shape information without getting errors.

Getting the Keanu version

Trying to record metrics of particular runs of a model. It would be nice to associate them with a specific Keanu version. For example, plotting the execution time of a model against the Keanu version.

TensorFlow exposes 'tf.version'. Could the Keanu version be baked into releases and made accessible by a similar method?

As an alternative, we are going to hardcode the version. However, this relies on it being updated by hand each time Keanu is updated.
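
For the Python package only, one possible stopgap is to read the version from the installed distribution's metadata instead of hardcoding it. This is a sketch using the standard library's `importlib.metadata` (Python 3.8+). The helper name `installed_version` is hypothetical, and the distribution name to pass in (presumably "keanu") is an assumption that should be checked against how the package was installed.

```python
from importlib import metadata


def installed_version(distribution_name):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "keanu" is assumed to be the distribution name; adjust if it differs.
    print(installed_version("keanu"))
```

This does not help on the Java side, where the version would still need to be baked into the release artefact.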

.softmax() for tensors and vertices

We already have a sigmoid() method for tensors, and I'd like to propose also implementing the softmax. This is a generalisation of the sigmoid that maps a vector to values between 0 and 1 that sum to 1.

The softmax is commonly used for multi-class classification problems as the output can be interpreted as the predicted probabilities of each class.

I think the method could look like softmax(int axis) so that all values are mapped to [0,1] and sum to 1 along axis.
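
To make the proposed semantics concrete, here is a minimal sketch of a softmax along an axis, written in plain Python for a 2-D nested list. It is not Keanu code. The `softmax(values, axis)` signature only mirrors the `softmax(int axis)` suggestion above, and the max-subtraction step is a standard numerical-stability trick rather than anything the issue asks for.

```python
import math


def softmax(values, axis):
    """Softmax of a 2-D list of floats along `axis` (0 = down columns, 1 = along rows)."""
    if axis not in (0, 1):
        raise ValueError("axis must be 0 or 1 for a 2-D input")
    # Work row-wise; for axis 0, transpose first and transpose back afterwards.
    if axis == 1:
        rows = [list(r) for r in values]
    else:
        rows = [list(c) for c in zip(*values)]
    result = []
    for row in rows:
        shift = max(row)  # subtracting the max avoids overflow in exp
        exps = [math.exp(v - shift) for v in row]
        total = sum(exps)
        result.append([e / total for e in exps])
    return result if axis == 1 else [list(c) for c in zip(*result)]


if __name__ == "__main__":
    print(softmax([[1.0, 2.0, 3.0], [0.0, 0.0, 0.0]], axis=1))
```

For a two-element vector [x, 0] the first output equals sigmoid(x), which is the sense in which softmax generalises the sigmoid.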

Docs describe `SelectVertex` which doesn't exist

Describe the bug
https://github.com/improbable-research/keanu/blob/develop/keanu-docs/docs/vertices.md lists an example for a SelectVertex - yet it doesn't exist in code.

To Reproduce
Steps to reproduce the behavior:

Go to 'https://github.com/improbable-research/keanu/blob/develop/keanu-docs/docs/vertices.md'
Look at the code example return new SelectVertex<MyType>(frequency);
Try to use in latest version
Does not compile

Expected behavior
I think the functionality is actually in CategoricalVertex so might just be a docs update

LinearRegression and LogisticRegression input shapes

Not really a bug, but the constructors of LinearRegression and LogisticRegression take x of shape (n_features, n_samples). In Python, at least, most libraries use the input shape (n_samples, n_features), so most modellers will expect that format.
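
Until the convention changes, data in the usual (n_samples, n_features) layout has to be transposed before it is passed in. A minimal illustration with nested lists follows. The helper name `to_features_by_samples` is made up, and no Keanu call is shown.

```python
def to_features_by_samples(x):
    """Transpose a (n_samples, n_features) nested list to (n_features, n_samples)."""
    return [list(column) for column in zip(*x)]


if __name__ == "__main__":
    # Three samples with two features each.
    samples = [[1, 2], [3, 4], [5, 6]]
    print(to_features_by_samples(samples))
```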

Set default values in the MetropolisHastings builder

Is your feature request related to a problem? Please describe.
When creating a new instance of MetropolisHastings via MetropolisHastings.builder(), all of the options must be set by the user or else they will be null. This means that, for example, you must call .random(new KeanuRandom()) or else sampling crashes with a NullPointerException.

Exception in thread "main" java.lang.NullPointerException
	at io.improbable.keanu.distributions.continuous.Triangular.sample(Triangular.java:26)
	at io.improbable.keanu.distributions.continuous.Triangular.sample(Triangular.java:8)
	at io.improbable.keanu.vertices.dbl.probabilistic.TriangularVertex.sample(TriangularVertex.java:117)
	at io.improbable.keanu.vertices.dbl.probabilistic.TriangularVertex.sample(TriangularVertex.java:14)
	at io.improbable.irsim.TreeProposal.setFor(TreeProposal.java:26)
	at io.improbable.irsim.TreeProposal.getProposal(TreeProposal.java:15)
	at io.improbable.keanu.algorithms.mcmc.MetropolisHastingsStep.step(MetropolisHastingsStep.java:74)
	at io.improbable.keanu.algorithms.mcmc.MetropolisHastingsStep.step(MetropolisHastingsStep.java:53)
	at io.improbable.keanu.algorithms.mcmc.MetropolisHastings.getPosteriorSamples(MetropolisHastings.java:71)
	at io.improbable.irsim.RoadBinaryParser.complete(RoadBinaryParser.java:137)
	at org.openstreetmap.osmosis.osmbinary.file.BlockInputStream.process(BlockInputStream.java:37)
	at io.improbable.irsim.IRSim.main(IRSim.java:29)

Describe the solution you'd like
Default values should be used so things such as a KeanuRandom() do not have to be passed in if not required.

Describe alternatives you've considered
An alternative solution might be to make the default config settings mutable.

Exponential Vertex with Non-Tensor Hyper-parameters doesn't handle Tensor dLogProb correctly

Describe the bug
The Exponential Vertex is lacking a test similar to matchesKnownDerivativeLogDensityOfVector. When added, the test fails. This is because some of the partial derivatives only generate a Scalar (due to the initial shape of the hyper-parameters) instead of the expected Tensor.

To Reproduce
Create an exponential vertex with scalar hyper-parameters and then retrieve logProb for a vector of Values - incorrect values will be returned.

Expected behavior
We'd expect a sum for the dLogProb across the full tensor, but instead you simply receive the value for the first entry.
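
A small worked example of the difference, in plain Python. It assumes a rate parameterisation, log p(x_i) = log(rate) - rate * x_i, which may not match how Keanu's Exponential vertex is parameterised. Both function names are hypothetical. With a scalar rate and a vector of values, the derivative with respect to the rate is the sum of the per-entry terms, not the first term alone.

```python
def d_log_prob_wrt_rate(rate, values):
    """d/d(rate) of sum_i [log(rate) - rate * x_i] for a scalar rate."""
    return sum(1.0 / rate - x for x in values)


def first_entry_only(rate, values):
    """The behaviour described in the issue: only the first entry contributes."""
    return 1.0 / rate - values[0]


if __name__ == "__main__":
    xs = [0.5, 1.0, 1.5]
    print(d_log_prob_wrt_rate(2.0, xs))  # sum over all three entries
    print(first_entry_only(2.0, xs))     # ignores the last two entries
```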

Continuous and discrete uniform logProb for boundary values is incorrect

Describe the bug
logProb for UniformVertex and UniformIntVertex returns 0 probability for xMin and a non-zero probability for xMax.

To Reproduce
Steps to reproduce the behavior:

Compute logProb at its boundaries for continuous and discrete uniform vertices

Expected behavior
According to the docs, continuous and discrete uniform vertices interpret xMin as inclusive and xMax as exclusive (i.e. logProb should return 0 probability for xMax and a non-zero probability for xMin).
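
The expected boundary behaviour can be written down as a short reference sketch in plain Python. The function names are illustrative and are not Keanu's. A probability of 0 corresponds to a log probability of negative infinity.

```python
import math


def uniform_log_prob(x, x_min, x_max):
    """Log density of a continuous uniform on [x_min, x_max)."""
    if x_min <= x < x_max:
        return -math.log(x_max - x_min)
    return -math.inf


def uniform_int_log_prob(k, x_min, x_max):
    """Log mass of a discrete uniform on the integers x_min, ..., x_max - 1."""
    if float(k).is_integer() and x_min <= k < x_max:
        return -math.log(x_max - x_min)
    return -math.inf


if __name__ == "__main__":
    print(uniform_log_prob(0.0, 0.0, 2.0), uniform_log_prob(2.0, 0.0, 2.0))
    print(uniform_int_log_prob(0, 0, 2), uniform_int_log_prob(2, 0, 2))
```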

NetworkSamples streaming never updates network state

networkSamples.generate(N) does not produce the same output as networkSamples.stream().limit(N).toList() for MetropolisHastings.

I believe the problem is that in MetropolisHastings.java lines 317 - 146:

        @Override
        public void sample(Map<Long, List<?>> samplesByVertex) {
            step();
            takeSamples(samplesByVertex, verticesToSampleFrom);
        }

        @Override
        public NetworkState sample() {
            return new SimpleNetworkState(takeSample(verticesToSampleFrom));
        }

the first method (called by generate(N)) calls step() before takeSamples but the second doesn't.
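
If that diagnosis is right, the effect can be reproduced with a toy model. The class below is a stand-in written in plain Python, not Keanu's MetropolisHastings. Its "state" is just a counter, so a sampling path that never calls step() returns the same value every time.

```python
class ToySampler:
    """Stand-in for a sampling algorithm; the state is a simple counter."""

    def __init__(self):
        self.state = 0

    def step(self):
        # A real sampler would propose a move and accept or reject it here.
        self.state += 1

    def sample_with_step(self):
        # Mirrors the first method in the issue: advance, then read.
        self.step()
        return self.state

    def sample_without_step(self):
        # Mirrors the second method: reads the state but never advances it.
        return self.state


def take(sample_fn, n):
    """Collect n samples by calling sample_fn repeatedly."""
    return [sample_fn() for _ in range(n)]


if __name__ == "__main__":
    print(take(ToySampler().sample_with_step, 3))
    print(take(ToySampler().sample_without_step, 3))
```

In this toy, adding a step() call to the streaming path makes the two paths agree. Whether that is the right fix in Keanu depends on how the stream is expected to handle the first sample.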

Dotsaver only saves first set of union of unconnected vertices

A list contains sets of vertices. The sets are not immediately connected to each other, but will be later in the program logic. To apply them to a Bayesian network, which accepts only one set of vertices, their union is created.

Expected behavior:
Dotsaver should save the full bayesian network containing all sets of vertices.

Actual behavior:
Dotsaver only saves the first connected set of vertices, ignoring the rest.

digraph BayesianNetwork {
<112> -> <113> [label=max]
<148> -> <149> [label=right]
<150> -> <151> [label=right]
<111> -> <113> [label=min]
<113> -> <149> [label=left]
<113> -> <151> [label=left]
113[label="UniformIntVertex"]
148[label="0"]
149[label="EqualsVertex"]
150[label="1"]
151[label="EqualsVertex"]
111[label="0"]
112[label="2"]
}

Code:

fun bug_a() {
    val cards = ArrayList<IntegerVertex>()
    for (i in 0 until 10) {
        cards.add(UniformIntVertex(0, 2)) // player 0 or 1
    }
    for (i in 0 until 5) {
        cards[i].observe(0) // ai's card
    }
    for (card in cards) {
        for (i in 0 until 2) {
            card.equalTo(ConstantVertex.of(i))
        }
    }
    val dotsaver = DotSaver(BayesianNetwork(cards))
    dotsaver.save(FileOutputStream(File("Dotfile")), false)
}

Can't observe an IntegerTensor in CategoricalVertex

It would be nice to add functionality to observe multiple values in an IntegerTensor for a CategoricalVertex to infer the probabilities of the categories.

Allow some sort of access to internal nd4j representation of `DoubleTensor`

I'm trying to convert some existing nd4j code into a keanu vertex.
As the existing code expects an INDArray as input, I'd like to convert the DoubleTensor into one. I can see its internal representation is the same, but I can't access it as it's private.

Could we have a getInternalRepresentation function on the DoubleTensor?
This would allow:

if (tensor instanceof Nd4jDoubleTensor) {
  // Cast the tensor itself; the original snippet cast the not-yet-defined variable `internal`.
  Nd4jDoubleTensor internal = (Nd4jDoubleTensor) tensor;
  INDArray representation = internal.getInternalRepresentation();
  // do something efficient
} else {
  // do something inefficient
}

The implementation is fairly straightforward:

public INDArray getInternalRepresentation(){
    return tensor;
}

Alternatives
This isn't a deal breaker - but copying via the CPU would impact speed significantly.
Obviously this risks unexpected access to the backing data for a tensor. If this is a blocker, then maybe a function that copies into another INDArray would ensure the data is duplicated but not changed.

I've raised a PR for this change at #129
