joelberkeley / spidr

65 stars, 2 watchers, 3 forks, 1.22 MB

Accelerated machine learning with dependent types

License: Apache License 2.0

Idris 77.18% C++ 14.88% Starlark 2.75% C 4.53% Shell 0.66%
dependent-types linear-algebra machine-learning probabilistic-modelling

spidr's People

Contributors

joelberkeley


spidr's Issues

Reuse constructed `XlaOp`s rather than rebuilding them every time

When a Tensor is used more than once, e.g. x in x = 1; y = x + x, a new XlaOp is created for every usage, rather than reusing the existing XlaOp. This wastes time constructing, and likely compiling, the graph, and may use more memory than necessary.

Refactor the internals of Tensor and Tensor ops such that values are reused.
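
An illustration of the problem (a hedged sketch: fill is a hypothetical scalar constructor, and element-wise + on Tensor is assumed):

y : Tensor [] Double
y = let x = fill 1.0 in x + x  -- both uses of x construct a fresh XlaOp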

Is `ArrayLike` too flexible?

ArrayLike can accept any data type, including Vects (e.g. an ArrayLike [3, 4, 5] Double is the same as an ArrayLike [3, 4] (Vect 5 Double)). This may make type inference awkward with no known benefit. Is it possible and beneficial to restrict the definition?

Integer width handling is not well defined

Integer width is relevant at a number of points in the chain of constructing and using Tensors:

  • the Idris data types we use (Integer or Int or Int32 etc.)
  • how those data types are interpreted by C (and how Idris interprets C types)
  • how those C types are converted to XLA primitives (and back)
  • how we handle under- and overflow in operations, and how we test for expected outcomes, i.e. whether Idris handles under- and overflow in the same way as the backend (see the sketch below)
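
A sketch of that last point (hedged: Idris 2's fixed-width integers are expected to wrap on overflow, but the behaviour through C and XLA must be checked to match):

main : IO ()
main = printLn (the Int32 2147483647 + 1)  -- expected to print -2147483648 in Idris 2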

An `Isomorphic` type for isomorphic shapes?

In some cases, it's useful to be able to say "this thing works for this shape and any shapes isomorphic to it". Isomorphic shapes include those with extra or fewer dimensions of length 1, e.g. [3, 1, 2], [3, 2] and [1, 3, 2]. It is essentially a cross between Squeezable and a subset of Broadcastable.

A use case I came across was ClosedFormDistribution for Gaussian. This is implemented for event shape [1], but could equally be implemented for [] or [1, 1], with something like

Isomorphic shape [] => ClosedFormDistribution shape Gaussian where
  ...

and we'd presumably reshape within the implementation as appropriate. Something similar could be done for GaussianProcess so that we don't need targets = [1].
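
One possible shape for the relation, as a hypothetical sketch (not a settled design): two shapes are isomorphic when they agree after ignoring dimensions of length 1.

-- assuming: Shape = List Nat, as in spidr
data Isomorphic : Shape -> Shape -> Type where
  Same : Isomorphic [] []
  Match : Isomorphic from to -> Isomorphic (x :: from) (x :: to)
  DropLeft : Isomorphic from to -> Isomorphic (1 :: from) to
  DropRight : Isomorphic from to -> Isomorphic from (1 :: to)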

Dedicated operators for elementwise ops?

Do we want a dedicated operator for element-wise multiplication, division, etc., so that x * y is always the mathematical version, and readers can differentiate between that and, say, x *# y for element-wise multiplication?
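
A hedged sketch of how that could look: the dedicated operator starts as an alias for today's element-wise (*) on Tensor (assumed to exist), freeing (*) to later mean the mathematical product.

infixl 9 *#

(*#) : Tensor shape Double -> Tensor shape Double -> Tensor shape Double
(*#) = (*)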

A well-founded `Broadcastable`

I suspect there is well-founded logic behind which shapes can and can't be broadcast to others. It would be really nice to have Broadcastable use that logic, so that we know it's complete and why (and perhaps even prove it's complete), as well as gain other potential niceties like better type inference, unique values for each pair of from and to shapes, etc.
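
One candidate rule, sketched on plain lists of dimension sizes (hedged: this is NumPy-style trailing-axis alignment, not a claim about spidr's current Broadcastable): a shape broadcasts to another if, once left-padded with ones to the same rank, each dimension matches the target or is one.

import Data.List

broadcastableTo : (from, to : List Nat) -> Bool
broadcastableTo from to =
  let padded = replicate (minus (length to) (length from)) 1 ++ from
  in  length from <= length to
      && all (uncurry (\f, t => f == t || f == 1)) (zip padded to)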

Add test cases for inf and nan

Many Tensor tests don't test with inf and nan values. This is particularly true for hard-coded test cases (often arrays). All ops that use Double should have such tests.
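
Since Idris Doubles are IEEE 754, the special values can be written directly as test inputs (a minimal sketch):

inf : Double
inf = 1.0 / 0.0

nan : Double
nan = 0.0 / 0.0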

Set up Idris tests for CI

Add at least two Idris 2 tests to CI. These can test really simple functions like \x => x + 1 and \x => -x, so that it's easy to add tests for new Idris functionality.

Tutorial on Gaussian process inference

Write and implement a tutorial on Gaussian process inference and how it is designed in spidr. We can do this as a LaTeX literate file, including equations for the code we're using, then build it and publish it to the website if possible.

Add XLA build to CI

Run XLA in the CI run. This helps towards being able to call XLA from Idris.

Allow broadcasting to match dims from different axes?

The XLA docs explain how a tensor of shape [n, m] can be broadcast to [p, n, m] or [n, p, m] when the axes to match with are specified. Do we want to support this? As of writing, this kind of broadcasting is possible in spidr by expanding dims then broadcasting (see the sketch after the questions below).

Questions:

  • would it simplify or complicate APIs?
  • would it improve performance?
  • how much would people actually use this?
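
A hedged sketch of the existing workaround (expand and broadcast, and their exact signatures, are assumptions about the API surface):

broadcastMiddle : Tensor [2, 3] Double -> Tensor [2, 5, 3] Double
broadcastMiddle x = broadcast (expand 1 x)  -- [2, 3] -> [2, 1, 3] -> [2, 5, 3]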

Choose infix operator precedence (and associativity?)

The author didn't know how to choose operator precedence when defining new operators, so they were all guesses. Go through all of them and find a good precedence for each.

We should also check associativity, though there was less guessing involved there.
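
For reference, Idris fixity declarations set both precedence (0 to 10) and associativity; the Prelude declares its arithmetic operators along these lines:

infixl 8 +, -
infixl 9 *, /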

Implement Einstein summation

It may be possible to use dependent types to implement Einstein summation (as per tf.einsum) with no runtime overhead and a type-checked index grammar ("...ij -> ...ji", etc.).
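
An illustration of the idea on plain Vects rather than Tensor (a hedged sketch): the "ij -> ji" spec is a transpose, and the index structure is checked entirely at the type level.

import Data.Vect

-- A mistyped spec would change the output shape and fail to type check.
transposeIJ : {n : Nat} -> Vect m (Vect n a) -> Vect n (Vect m a)
transposeIJ = transpose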

Implement SGD

This would introduce gradient descent algorithms into spidr.
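
A minimal sketch of a single SGD step on a scalar parameter (plain Doubles, with the gradient supplied by the caller; spidr's eventual API would differ):

sgdStep : (grad : Double -> Double) -> (learningRate : Double) -> Double -> Double
sgdStep grad learningRate theta = theta - learningRate * grad theta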

Run Poplibs on CI

In order to call into Poplibs from Idris, we want to install Poplibs as part of the CI run.

Implement `fromInteger` and `fromDouble` for scalars

It would be nice for users to be able to write scalar tensors as numeric literals. This would be possible if we implemented fromInteger and fromDouble. Bear in mind that this may confuse the type checker, as e.g. 1 * 2 would be ambiguous.
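
A hedged sketch of the fromInteger half (scalar is a hypothetical constructor for scalar tensors):

fromInteger : Integer -> Tensor [] Double
fromInteger x = scalar (cast x)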

Implement automatic differentiation

Research how to implement autograd.

We could do this by hand-writing the derivative of each op, though it would be best if we could ensure that grad can only be used if the entire composed function has gradients implemented throughout.

One possibility is to add an additional data constructor for Tensor, where the first component corresponds to the op applied to the value, and the latter to the derivative of the op at the value. Totality may then ensure derivatives are available throughout. Additionally, this may work well with functions of multiple variables.
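
The paired value/derivative idea can be prototyped as forward-mode dual numbers over plain Doubles (a minimal sketch, not spidr's design):

record Dual where
  constructor MkDual
  value : Double
  deriv : Double

-- The product rule carries the derivative through multiplication.
mul : Dual -> Dual -> Dual
mul (MkDual x dx) (MkDual y dy) = MkDual (x * y) (x * dy + y * dx)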

References

Provably Correct, Asymptotically Efficient, Higher-Order Reverse-Mode Automatic Differentiation

Run minimal backend example from Idris

This is a major milestone in the project, where we can run some backend code from Idris. The example can be of any size, as long as it's possible to verify it's working.

Implement NN with uncertainty estimates

It would be illustrative to compare a Gaussian process implementation to a neural network implementation with uncertainty estimates, from both a design and a performance perspective.

Use NN with uncertainty estimates in BO tutorial

Implement a ProbabilisticModel for whatever was implemented in #62, and use it in the Bayesian optimization tutorial for the failure data. Whether it's appropriate for that usage is probably not too important right now.

An alternative is a neural process

Duplicate target shape required for `Empiric` and `ProbabilisticModel`

Usages of Empiric and ProbabilisticModel require the target shape to be specified even when it can be derived from the distribution event shape.

Note: the compiler balks if the explicit target shape does not match that in the distribution, which implies it knows they need to be the same shape.

Implement category-theoretic interfaces for `Tensor`

If we implement standard category-theoretic interfaces, such as Functor, Foldable and Semigroup, this will make spidr more easily usable in the Idris ecosystem, as well as provide the standard benefits of such interfaces.
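
A hedged sketch of one such implementation (assuming spidr's element-wise (+) on Tensor):

Semigroup (Tensor shape Double) where
  (<+>) = (+)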
