crystal-data / num.cr
Scientific computing in pure Crystal
License: MIT License
First link in the Bottle docs is to CrystalData.org, which is not responding, possibly due to a problem with DNS configuration or domain registration.
Right now, all types of Tensors are implemented separately, so ClTensors and Tensors have entirely separate implementations. There are plenty of other backends I would like to support. Ideally, these should not require multiple implementations, but rather be created as interfaces that can all be developed in a single class.
This would be a major overhaul to existing implementations, but I think Tensor(T) should become Tensor(T, V), where T is a base Crystal data type, and V is a backend implementation. This would also have to not break Num::Grad and Num::NN, which would mean that all gates would work across all backends, or at least raise unimplemented errors at compile time vs. runtime.
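A rough sketch of what that single interface could look like, with hypothetical names (Storage, CpuStorage) that are not the library's actual types:
# Hypothetical sketch only: each backend supplies the same small storage
# interface, and Tensor(T, V) dispatches element access through it, so
# Num::Grad and Num::NN never need backend-specific code paths.
abstract class Storage(T)
  abstract def size : Int32
  abstract def get(index : Int32) : T
  abstract def set(index : Int32, value : T)
end

class CpuStorage(T) < Storage(T)
  @data : Pointer(T)
  @size : Int32

  def initialize(size : Int32)
    @size = size
    @data = Pointer(T).malloc(size)
  end

  def size : Int32
    @size
  end

  def get(index : Int32) : T
    @data[index]
  end

  def set(index : Int32, value : T)
    @data[index] = value
  end
end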
As the Num::NN module expands, especially to support many more complex layer types and network types, keeping the automatic differentiation up to date with each of these layers will make maintenance harder than I think it is worth. I'm already running into this with RNN, GRU, and LSTM layers, where it's much easier to not worry about deriving multiple hidden states. I think it is good to keep Num::Grad, to allow users to build their own networks, but also provide a much more flexible option.
Potentially while doing this, it might be worth it to make Num::Grad::Variable a true wrapper around Tensor, to make syntax more similar.
require "num"
include Num
alias RT64 = Tensor(Float64, CPU(Float64))
alias CT64 = Tensor(Complex, CPU(Complex))
complexMatrix = RT64.random(0.0..1.0, [3, 8])
realVector = RT64.random(0.0..1.0, [8])
complexMatrix *= (1 + 1.i) # Didn't get random working, so cheat here.
result = complexMatrix.matmul(realVector)
pp result
matmul (and possibly other operations, I have not managed to check yet) only seems to be defined for operand matrices of the same data type. Multiplying a complex and a float is well-defined on an element-by-element basis, as would be multiplication of a real and an integer, so I reckon an overload should be provided for cases where the underlying element types are different but compatible.
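For what it's worth, the element-level operation is already well-defined in Crystal's standard library, which is the basis for the overload request; a minimal scalar check:
require "complex"

c = Complex.new(2.0, 3.0)
f = 0.5
puts c * f # 2.0 * 0.5 + (3.0 * 0.5)i, i.e. 1.0 + 1.5i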
I'm working on some toy models using num.cr while following some PyTorch examples. As I was setting up the forward calculations using a gradient variable, I needed log and tanh functions, which are currently not implemented.
Hello,
It seems there is a simple error when executing the following code:
require "num"
require "complex"

a1 = Complex.new(9, 0)
a2 = Complex.new(6, 0)
a3 = Complex.new(12, 0)
a4 = Complex.new(8, 0)

y = [[a1, a2], [a3, a4]]
x = y.to_tensor.as_type(Complex)
puts x

begin
  puts x.inv
rescue ex
  puts ex.message
end

puts "continued"
I get the following error:
In lib/num/src/linalg/work.cr:22:30
22 | def get_cmplx(n) : Slice(LibCBLAS::ComplexDouble)
^----------------------
Error: undefined constant LibCBLAS::ComplexDouble
Did you mean 'LibCblas'?
When I change the import to LibCblas instead of LibCBLAS, it works as normal.
Thanks in advance
All gates should be able to be serialized and read back in to save trained models to disk.
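One possible shape for this, sketched with the standard library's JSON support (the DenseState struct and its fields are hypothetical, not num.cr types):
require "json"

# Hypothetical sketch: each gate/layer exposes a serializable state struct.
struct DenseState
  include JSON::Serializable

  property weights : Array(Float64)
  property bias : Array(Float64)

  def initialize(@weights, @bias)
  end
end

state = DenseState.new([0.1, 0.2], [0.0])
File.write("dense.json", state.to_json)
restored = DenseState.from_json(File.read("dense.json"))
puts restored.weights # [0.1, 0.2]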
Having fun with this package, but I cannot figure out how to move Tensors onto the GPU instead of the CPU. When I call tensor.opencl, it stays on the CPU. Yes, I have a functioning clblas.
Right now, there is no way to do the following index:
m = Matrix.new [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
m[[0, 1], 1...3]
# should return a matrix
# [[2.0, 3.0],
#  [5.0, 6.0]]
This should be a COPY, not a view, since it can be non-contiguous.
Would it be a good idea to add in a [1, 2, 3].to_tensor method for arrays? Maybe .to_vector and .to_matrix, similar to this repository? It would be a lot easier and prettier than Tensor.from_array [1, 2, 3].
https://github.com/ruivieira/crystal-gsl/blob/fa8ea796f06c77add4bb092be7bbdd93794027d6/src/gsl/base/array.cr
It would be great to have the ability to convert a Tensor to JSON and vice versa. I know I can go the Tensor -> Array -> JSON route, but it should be pretty simple to add.
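Until then, a sketch of the Tensor -> Array -> JSON route mentioned above (this assumes Tensor#to_a returns a flat array; adjust to whatever accessor the release actually exposes):
require "num"
require "json"

t = [[1, 2], [3, 4]].to_tensor
json = {"shape" => t.shape, "data" => t.to_a}.to_json
puts json # {"shape":[2,2],"data":[1,2,3,4]}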
This is a bit of an odd one :) Here is some code that first creates a 3-by-2, transposes it, then scales each column. However, the values in the middle column end up swapped. This only happens when transpose is involved; when the 2-by-3 is created directly, things function correctly.
a = [[0,1], [0,1], [0,1]].to_tensor.astype(Float64)
puts a
# [[0, 1],
# [0, 1],
# [0, 1]]
a = a.transpose
puts a
# [[0, 0, 0],
# [1, 1, 1]]
b = Tensor.diag([1,2,3].to_tensor.astype(Float64))
puts b
# [[1, 0, 0],
# [0, 2, 0],
# [0, 0, 3]]
puts a.matmul(b)
# [[0, 2, 0],
# [1, 0, 3]]
## Expected output:
# [[0, 0, 0],
# [1, 2, 3]]
Again, this works correctly when a = [[0, 0, 0], [1, 1, 1]].to_tensor.astype(Float64).
Even using gsl, Crystal beats some of the naive vector operations that don't drop to blas, but just use gsl, due to the overhead of copying/access. I think it's worth it to just implement the array class in Crystal.
Add in basic CSR, CSC and COO sparse matrices, backed by a sparse blas implementation.
WIP #45
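For reference, a minimal illustration of the CSR layout in plain Crystal (just the data layout, not a proposed API):
# CSR stores non-zero values, their column indices, and row boundary offsets.
record CSR(T), values : Array(T), col_indices : Array(Int32), row_ptr : Array(Int32)

# Dense matrix:
# [[5, 0, 0],
#  [0, 8, 0],
#  [0, 0, 3]]
csr = CSR.new([5, 8, 3], [0, 1, 2], [0, 1, 2, 3])
puts csr.values # [5, 8, 3]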
It would be helpful to have gates for trig functions with simple derivatives, such as sin, cos, etc. The general pattern can be viewed in grad.arithmetic_ops. A Gate must have a backward method and a cache method. So for example, the sin gate would look like:
class Num::Grad::SinGate(T) < Num::Grad::Gate(T)
  getter a : Num::Grad::Variable(T)

  def initialize(@a : Num::Grad::Variable(T))
  end

  def backward(payload : Num::Grad::Payload(T)) : Array(T)
    gradient = payload.variable.grad
    r0 = gradient.map(a.value) do |i, j|
      i * Math.cos(j)
    end
    [r0]
  end

  def cache(result : Num::Grad::Variable(T), *args)
    a = args[0]
    result.grad = T.zeros_like(result.value)
    result.requires_grad = true
    Num::Grad.register("Sin", self, result, a)
  end
end
After the gate is created, an operator should be added directly to Variable, calling this function and caching it on a context:
class Num::Grad::Variable(T)
  def sin : Num::Grad::Variable(T)
    result = @context.variable(@value.sin)
    if self.is_grad_needed
      # The gate needs the input variable, since backward takes cos of a.value
      gate = Num::Grad::SinGate.new(self)
      gate.cache(result, self)
    end
    result
  end
end
Testing the derivative of the sin function:
ctx = Num::Grad::Context(Tensor(Float64)).new
t = [0.0, Math::PI].to_tensor
a = ctx.variable(t)
b = a.sin
b.backprop
puts a.grad
# [1, -1]
At the top of the file src/libs/cblas.cr it shows the default BLAS library is openblas.
{% if flag?(:openblas) %}
  @[Link("openblas")]
{% elsif flag?(:accelerate) %}
  @[Link(framework: "Accelerate")]
{% elsif flag?(:darwin) %}
  @[Link(framework: "Accelerate")]
{% else %}
  @[Link("openblas")] # => change to @[Link("blas")]
{% end %}
Could you change the default library to @[Link("blas")]? Or add a flag to make the link to libblas? It would be helpful for systems where only libblas is installed (like the Crystal env of replit.com).
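One possible shape for that flag, sketched against the snippet above (the :blas flag name is only a suggestion):
{% if flag?(:blas) %}
  @[Link("blas")]
{% elsif flag?(:openblas) %}
  @[Link("openblas")]
{% elsif flag?(:accelerate) %}
  @[Link(framework: "Accelerate")]
{% elsif flag?(:darwin) %}
  @[Link(framework: "Accelerate")]
{% else %}
  @[Link("openblas")]
{% end %}
Users on systems with only libblas could then build with crystal build app.cr -Dblas, while the current defaults stay unchanged for everyone else.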
The latest version of alea is pulled in as a dependency by num.cr. alea's latest now requires Crystal v0.36.0 or greater. This forces num.cr users to move to v0.36.x, as far as I can tell.
It seems Crystal 0.36.1 complains about abstract def cache in Num::Grad::Gate(T). Also, Indexable.range_to_index_and_count can now return nil. I've opened a reference PR to discuss the details of this: #64
I am continuing to work on toy models using num.cr while following some PyTorch examples. As I was setting up the forward calculations using a gradient variable, I needed sum and mean functions, which are currently not implemented.
Tensor(Float64).linear_space(a,b,c)
Tensor(T)#range, etc...
The API docs here https://crystal-data.github.io/num.cr/Tensor.html describe several constructor methods that are not available in any code that pulls the shard from the latest release, which is confusing. Also range, etc. are missing. It's not clear to me if the methods have been intentionally removed, as they look like they were just added in #30.
Line 159 in 32a5d07
Shouldn't it be def arange(start : Number, stop : Number, step : Number = 1)? I'm kinda new to Crystal, so I wasn't super sure what forall U was.
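For context on forall (a sketch, not taken from the library): it introduces a free type variable that the compiler infers from the arguments at each call site.
def first_element(arr : Array(U)) : U forall U
  arr[0]
end

puts first_element([1, 2, 3])  # U inferred as Int32 => 1
puts first_element([1.0, 2.0]) # U inferred as Float64 => 1.0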
Similar to numpy.arange or numpy.linspace, take a start (inclusive), end (exclusive), and delta:
puts Tensor.arange(3)
# [0, 1, 2]
puts Tensor.arange(3.0)
# [ 0.0, 1.0, 2.0]
puts Tensor.arange(3,7)
# [3, 4, 5, 6]
puts Tensor.arange(3,7,2)
# [3, 5]
puts Tensor.arange(10, 5, -2)
# [10, 8, 6]
puts Tensor.arange(15, 10, 1)
# []
puts Tensor.arange(3, 6, 0.5)
# [3.0, 3.5, 4.0, 4.5, 5.0, 5.5]
Edit: add empty-range edge case example.
Edit 2: add non-integer delta example
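A rough sketch of those semantics in plain Crystal (not the library's implementation; it just leans on the .to_tensor conversion used elsewhere on this page):
def arange(start : Number, stop : Number, step : Number = 1)
  # Number of steps from start (inclusive) towards stop (exclusive).
  count = ((stop - start) / step).ceil.to_i
  count = 0 if count < 0
  Array.new(count) { |i| start + i * step }.to_tensor
end

puts arange(3, 7, 2)   # [3, 5]
puts arange(10, 5, -2) # [10, 8, 6]
puts arange(15, 10, 1) # []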
This code, taken partially from the readme:
a = [[1, 2], [3, 4]].to_tensor.astype(Float32)
b = [[3, 4], [5, 6]].to_tensor.astype(Float32)
puts a.matmul(b)
Throws Error: undefined method 'buffer' for Tensor(Float32) after expanding the blas macro.
Edit: From some quick testing, it looks like a swap of .storage instead of .buffer in tensor/linalg#matmul will fix this. Putting together a PR, but let me know if this isn't the right way to fix it.
While building a toy model in Crystal using num.cr I ran into the following compile error.
5 | nt = -t
^
Error: wrong number of arguments for 'Tensor(Int32, CPU(Int32))#-' (given 0, expected 1)
Overloads are:
- Tensor(T, S)#-(other)
I have been mostly messing around with blas/lapack/gsl at this point just to test performance and ability to store and manipulate data. Before the library goes any further I want to nail down some better design for the library. Primarily this means a generic way to support multiple data types while only having to write code once.
My previous messy macro solution to this (there are examples in most core files):
It would be really cool if this could support CUDA as a backend.
In many scenarios it is helpful to have read+write slicing within a sub-matrix/tensor:
a = [[3, 4, 5, 6], [5, 6, 7, 8]].to_tensor
b = [[11, 12, 13]].to_tensor
a[1, 1..] = b
puts a # same as a[.., ..] or a[..., ...] etc.
# [[3, 4, 5, 6],
# [5, 11, 12, 13]]
puts a[0..1, 1...3] # note inclusive vs. exclusive ranges
# [[4, 5],
# [11, 12]]
For contiguous vectors and matrices, avoiding stride checks is much faster. Need to implement a generic way to fast track operations on contiguous arrays.
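An illustrative sketch of the idea in plain Crystal (not num.cr internals): a 1-D map that skips per-element stride arithmetic when the underlying buffer is contiguous.
def map1d(data : Slice(Float64), size : Int32, stride : Int32)
  if stride == 1
    # Fast path: contiguous data, walk the slice directly.
    Array.new(size) { |i| yield data[i] }
  else
    # Generic path: compute a strided offset for every element.
    Array.new(size) { |i| yield data[i * stride] }
  end
end

buf = Slice[1.0, 2.0, 3.0, 4.0]
doubled = map1d(buf, 4, 1) { |x| x * 2 }
strided = map1d(buf, 2, 2) { |x| x * 2 }
puts doubled # [2.0, 4.0, 6.0, 8.0]
puts strided # [2.0, 6.0]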
There should be a base implementation of an n-dimensional tensor that everything inherits from, to make implementing character arrays, structured arrays, and masks easier.
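A hypothetical sketch of what such a base could look like (names are illustrative only): shape/stride bookkeeping lives in the abstract class, and each concrete array type only supplies its own element storage.
abstract class BaseNDArray(T)
  getter shape : Array(Int32)
  getter strides : Array(Int32)

  def initialize(@shape, @strides)
  end

  abstract def [](index : Array(Int32)) : T
end

class CharArray < BaseNDArray(UInt8)
  def initialize(shape : Array(Int32), strides : Array(Int32), @chars : Bytes)
    super(shape, strides)
  end

  def [](index : Array(Int32)) : UInt8
    # Strided lookup: dot product of the index with the strides.
    offset = index.zip(strides).sum { |i, s| i * s }
    @chars[offset]
  end
end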