joelberkeley / spidr
Accelerated machine learning with dependent types
License: Apache License 2.0
In Acquisition, is there a simpler way (e.g. using S) to require that batch_size is positive, while actually passing in the batch size and not batch_size - 1, which would be a weird API?
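As a sketch of one option (acquire is a hypothetical stand-in, not spidr's API), base's Data.Nat provides IsSucc, which can be taken as an erased auto-implicit proof, so callers pass the batch size itself:

import Data.Nat

-- Take the batch size directly, with a compile-time proof it is positive.
acquire : (batchSize : Nat) -> {auto 0 ok : IsSucc batchSize} -> String
acquire batchSize = "batch of " ++ show batchSize

ok : String
ok = acquire 3  -- type checks; `acquire 0` is rejected at compile time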
This is for completeness, since it's perfectly reasonable to do this without any data.
When a Tensor is used more than once, e.g. x in x = 1; y = x + x, a new XlaOp is created for every usage, rather than reusing the XlaOp. This wastes time in constructing, and likely compiling, the graph. It may also use more memory than necessary. Refactor the internals of Tensor and Tensor ops such that values are reused.
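One possible shape of the fix, sketched with stand-ins (Node, XlaOp and build are hypothetical, not spidr's internals): give each Tensor node an identifier and cache the op built for it, so shared subterms are constructed exactly once.

import Data.SortedMap

data XlaOp = MkXlaOp Int

record Node where
  constructor MkNode
  nodeId : Int

build : SortedMap Int XlaOp -> Node -> (SortedMap Int XlaOp, XlaOp)
build cache node = case lookup node.nodeId cache of
  Just op => (cache, op)                  -- reuse the op built earlier
  Nothing =>
    let op = MkXlaOp node.nodeId          -- construct once ...
    in (insert node.nodeId op cache, op)  -- ... and record it for later usages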
See idris-lang/Idris2#53 for status on Idris2 support for building docs
ArrayLike can accept any data type, including Vects (e.g. an ArrayLike [3, 4, 5] Double is the same as an ArrayLike [3, 4] (Vect 5 Double)). This may make type inference awkward, with no known benefit. Is it possible and beneficial to restrict the definition?
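One way to restrict it, sketched with hypothetical names (Dtype, rep): index arrays by a closed datatype of scalar dtypes rather than an arbitrary Type, so element types like Vect 5 Double are excluded by construction.

import Data.Vect

data Dtype = F64 | I32

rep : Dtype -> Type
rep F64 = Double
rep I32 = Int

ArrayLike : List Nat -> Dtype -> Type
ArrayLike []          dtype = rep dtype
ArrayLike (n :: rest) dtype = Vect n (ArrayLike rest dtype)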
Integer width is relevant at a number of points in the chain of constructing and using Tensors (Integer or Int or Int32 etc.).
Strictly orthogonal Broadcastable constructors may make implementations easier.
In some cases it's useful to be able to say "this thing works for this shape and any shapes isomorphic to it". Isomorphic shapes include those with extra or fewer dimensions of length 1, e.g. [3, 1, 2], [3, 2] and [1, 3, 2]. It is essentially a cross between Squeezable and a subset of Broadcastable.
A use case I came across was ClosedFormDistribution for Gaussian. This is implemented for event shape [1], but could equally be implemented for [] or [1, 1], with something like

Isomorphic shape [] => ClosedFormDistribution shape Gaussian where
  ...

and we'd presumably reshape within the implementation as appropriate. Similar could be done for GaussianProcess so that we don't need targets = [1].
Do we want a dedicated operator for element-wise multiplication, division etc., so that x * y is always the mathematical version, and readers can differentiate between that and, say, x *# y for element-wise multiplication?
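A minimal sketch of the operator itself, with plain Lists standing in for Tensors (the *# name is just a suggestion):

infixl 9 *#

-- Element-wise (Hadamard) product; (*) stays reserved for the mathematical one.
(*#) : Num a => List a -> List a -> List a
(*#) = zipWith (*)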
I suspect there is well-founded logic behind what shapes can and can't be broadcast to others. It would be really nice to have Broadcastable use that logic, so that we know it's complete and why (and perhaps even prove it's complete?), as well as other potential niceties like better type inference, unique values for each pair of from and to shapes, etc.
Scalar Tensor values have a natural interpretation as numerics. We could implement the numeric interfaces Num, Neg, Abs etc. for these.
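A minimal sketch with a stand-in scalar wrapper rather than the real Tensor; Neg, Abs etc. would follow the same pattern:

data Scalar : Type -> Type where
  MkScalar : a -> Scalar a

Num a => Num (Scalar a) where
  (MkScalar x) + (MkScalar y) = MkScalar (x + y)
  (MkScalar x) * (MkScalar y) = MkScalar (x * y)
  fromInteger = MkScalar . fromInteger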
So that we can begin to use broadcasting, add the basic definition of broadcasting, such that code type checks for adding an axis of any size: shape -> n :: shape for any n.
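A hedged sketch of what that might look like (spidr's actual Broadcastable may differ): a Nest constructor adds one leading axis of any length n.

data Broadcastable : (from : List Nat) -> (to : List Nat) -> Type where
  Same : Broadcastable shape shape
  Nest : Broadcastable shape (n :: shape)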
Run Poplibs in the CI run. This helps towards being able to call Poplibs from Idris.
The Cholesky factor can be reused for inference and for calculating the marginal likelihood. Write an implementation that takes this into account so it's not recalculated.
This may help https://gregorygundersen.com/blog/2019/09/12/practical-gp-regression/
Requires #105
see title
Many Tensor tests don't test with inf and nan values. This is particularly true for hard-coded test cases (often arrays). All ops that use Double should have such tests.
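The kind of hard-coded cases to include, as a sketch (edgeCases is a hypothetical name):

inf : Double
inf = 1.0 / 0.0

nan : Double
nan = 0.0 / 0.0

edgeCases : List Double
edgeCases = [-inf, -1.0, -0.0, 0.0, 1.0, inf, nan]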
The standard operator for in-place division is /= in other languages. In Idris, that operator is used for inequality. How do we resolve this?
Add at least two Idris 2 tests to CI. These can be of really simple functions, like \x => x + 1 and \x => -x, so that it's easy to add tests for new Idris functionality.
Write and implement a tutorial on Gaussian process inference and how it is designed in spidr. We can do this as a LaTeX literate file, including equations for the code we're using, then build it and publish it to the website if possible.
Run XLA in the CI run. This helps towards being able to call XLA from Idris.
The XLA docs explain how a tensor of shape [n, m] can be broadcast to [p, n, m] or [n, p, m] when the axes to match with are specified. Do we want to support this? As of writing, this kind of broadcasting is possible in spidr by expanding dims then broadcasting.
The author didn't know how to choose operator precedence when defining new operators. As such, they were all guesses. Go through all of them and find a good precedence for each. Associativity should also be checked, though there was less guessing involved there.
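For reference, both precedence and associativity are set in Idris fixity declarations, so this is a matter of revisiting lines like the following (operators here are hypothetical):

infixl 8 +.  -- left-associative, precedence 8
infixr 9 *.  -- right-associative, precedence 9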
Call a simple C function (e.g. \x => x + 1) from Idris 2.
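A minimal Idris 2 FFI sketch, assuming a C library libinc exposing int inc(int x) { return x + 1; } (the library and function names are made up for illustration):

%foreign "C:inc,libinc"
inc : Int -> Int

main : IO ()
main = printLn (inc 1)  -- prints 2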
Towards the end goal of running the backend from Idris, it would help to be able to run an example from pure C.
Idris version number in package description file is a string because the compiler won't accept it in the format illustrated in the docs. Have raised this with Idris devs: idris-lang/Idris2#1373
The Compiler module should only be used to provide implementations for Tensor functionality. It shouldn't be importable by client code.
Support complex numbers (integral and/or floating point depending on what XLA supports).
It may be possible to use dependent types to implement Einstein summation (as per tf.einsum) with no runtime overhead and a type-checked index grammar ("...ij -> ...ji" etc.).
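As a tiny illustration of the flavour, "ij -> ji" is already expressible as a shape-checked operation on nested Vects; a dependently typed einsum would generalise the index grammar.

import Data.Vect

transposeIJ : {n : Nat} -> Vect m (Vect n a) -> Vect n (Vect m a)
transposeIJ = transpose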
This would introduce gradient descent algorithms into spidr
Requires #67. Possibly not needed if BNNs or neural processes cover this already.
In order to call into Poplibs from Idris, we want to install Poplibs as part of the CI run
As a user, it would be nice to be able to write scalar tensors using numeric literals. This would be possible if we implemented fromInteger and fromDouble. Bear in mind that this may confuse the type checker, as e.g. 1 * 2 will be ambiguous.
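A sketch with a stand-in type (Scalar64 is hypothetical): implementing Prelude's FromDouble is what lets literals like 0.5 elaborate to scalar values, and fromInteger works analogously via Num.

data Scalar64 = MkScalar64 Double

FromDouble Scalar64 where
  fromDouble = MkScalar64

half : Scalar64
half = 0.5  -- elaborates to fromDouble 0.5

For ambiguous cases like 1 * 2, the usual escape hatch is an explicit ascription with the.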
It may be better if we autogenerate the C wrapper for XLA. SWIG is one option for this.
The likelihood mean is unused in the calculation of the GP posterior. This is error-prone and counterintuitive. Options include using it, or only accepting the likelihood variance in the arguments
Via variational inference or warped GPs, or some other method. Possibly not needed if BNNs or neural processes cover this already
When mathematical ops fail, should they return an Error or do something else? How does XLA handle failure for maths ops?
Idris is unable to resolve the shapes for the matrix multiplication operator @@, and specifying the shapes is highly unergonomic.
Research how to implement autograd. We could do this by hand-writing the derivative of each op, though it would be best if we could ensure that grad can only be used if the entire composed function has gradients implemented throughout.
One possibility is to add an additional data constructor for Tensor, where the first pattern corresponds to the op on the value, and the latter to the derivative of the op on the value. Totality may then ensure derivatives are available throughout. Additionally, this may work well with functions of multiple variables.
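A speculative sketch of that idea (all names hypothetical; this is closer to forward-mode dual numbers than the reverse mode of the reference below, but it shows how totality could enforce derivative coverage):

data Tensor : Type where
  Val  : Double -> Tensor
  Dual : (value : Double) -> (derivative : Double) -> Tensor

-- Every op must cover both constructors to be total, so a derivative
-- must be supplied for each op.
square : Tensor -> Tensor
square (Val x)     = Val (x * x)
square (Dual x dx) = Dual (x * x) (2 * x * dx)  -- chain rule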
References
Provably Correct, Asymptotically Efficient, Higher-Order Reverse-Mode Automatic Differentiation
This is a major milestone in the project, where we can run some backend code from Idris. The example can be of any size, as long as it's possible to verify it's working.
Probabilistic programming seems to be an increasingly popular approach to programming with probability distributions. Determine what that might look like in Idris, and whether it's worth adopting.
TensorFlow allows broadcasting to dimensions of length 0, e.g. 1 :: t -> 0 :: t. Do we allow this and, if not, should we?
It would be illustrative to compare a Gaussian process implementation to a neural network implementation with uncertainty estimates, from both a design perspective and a performance perspective.
Implement a ProbabilisticModel for whatever was implemented in #62, and use it in the Bayesian optimization tutorial for the failure data. Whether it's appropriate for that usage is probably not too important right now. An alternative is a neural process.
Usages of Empiric and ProbabilisticModel require the target shape to be specified even when it can be derived from the distribution event shape.
Note: the compiler balks if the explicit target shape does not match that in the distribution, which implies it knows they need to be the same shape.
If we implement standard category-theoretic interfaces, such as Functor, Foldable and Semigroup, this will make spidr more easily usable in the Idris ecosystem, as well as provide the standard benefits of such interfaces.
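A minimal sketch with a stand-in container (Boxed is hypothetical; the real implementations would be for spidr's own types):

data Boxed : Type -> Type where
  MkBoxed : List a -> Boxed a

Functor Boxed where
  map f (MkBoxed xs) = MkBoxed (map f xs)

Semigroup a => Semigroup (Boxed a) where
  (MkBoxed xs) <+> (MkBoxed ys) = MkBoxed (zipWith (<+>) xs ys)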