crystal-data / num.cr
Scientific computing in pure Crystal
License: MIT License
First link in the Bottle docs is to CrystalData.org, which is not responding, possibly due to a problem with DNS configuration or domain registration.
Right now, all types of Tensors are implemented separately, so ClTensors and Tensors have entirely separate implementations. There are plenty of other backends I would like to support. Ideally, these should not require multiple implementations, but rather be created as interfaces that can all be developed in a single class.
This would be a major overhaul to existing implementations, but I think Tensor(T) should become Tensor(T, V), where T is a base Crystal data type, and V is a backend implementation. This would also have to not break Num::Grad and Num::NN, which would mean that all gates would work across all backends, or at least raise unimplemented errors at compile time vs. runtime.
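A rough sketch of what that single interface could look like, with hypothetical names (Storage, CpuStorage) that are not the library's actual types:
# Hypothetical sketch only: each backend supplies the same small storage
# interface, and Tensor(T, V) dispatches element access through it, so
# Num::Grad and Num::NN never need backend-specific code paths.
abstract class Storage(T)
  abstract def size : Int32
  abstract def get(index : Int32) : T
  abstract def set(index : Int32, value : T)
end

class CpuStorage(T) < Storage(T)
  @data : Pointer(T)
  @size : Int32

  def initialize(size : Int32)
    @size = size
    @data = Pointer(T).malloc(size)
  end

  def size : Int32
    @size
  end

  def get(index : Int32) : T
    @data[index]
  end

  def set(index : Int32, value : T)
    @data[index] = value
  end
end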
As the Num::NN module expands, especially to support many more complex layer types and network types, keeping the automatic differentiation up to date with each of these layers will make maintenance harder than I think it is worth. I'm already running into this with RNN, GRU, and LSTM layers, where it's much easier to not worry about deriving multiple hidden states. I think it is good to keep Num::Grad, to allow users to build their own networks, but also provide a much more flexible option.
Potentially while doing this, it might be worth it to make Num::Grad::Variable a true wrapper around Tensor, to make syntax more similar.
require "num"
include Num
alias RT64 = Tensor(Float64, CPU(Float64))
alias CT64 = Tensor(Complex, CPU(Complex))
complexMatrix = RT64.random(0.0..1.0, [3, 8])
realVector = RT64.random(0.0..1.0, [8])
complexMatrix *= (1 + 1.i) # Didn't get random working, so cheat here.
result = complexMatrix.matmul(realVector)
pp result
matmul (and possibly other operations, I have not managed to check yet) only seems to be defined for operand matrices of the same data type. Multiplying a complex and a float is well-defined on an element-by-element basis, as would be multiplication of a real and an integer, so I reckon an overload should be provided for cases where the underlying element types are different but compatible.
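For what it's worth, the element-level operation is already well-defined in Crystal's standard library, which is the basis for the overload request; a minimal scalar check:
require "complex"

c = Complex.new(2.0, 3.0)
f = 0.5
puts c * f # 2.0 * 0.5 + (3.0 * 0.5)i, i.e. 1.0 + 1.5i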
I'm working on some toy models using num.cr while following some PyTorch examples. As I was setting up the forward calculations using a gradient variable, I needed log and tanh functions, which are currently not implemented.
Hello,
It seems there is a simple error when executing the following code:
require "num"
require "complex"

a1 = Complex.new(9, 0)
a2 = Complex.new(6, 0)
a3 = Complex.new(12, 0)
a4 = Complex.new(8, 0)

y = [[a1, a2], [a3, a4]]
x = y.to_tensor.as_type(Complex)
puts x

begin
  puts x.inv
rescue ex
  puts ex.message
end

puts "continued"
I get the following error:
In lib/num/src/linalg/work.cr:22:30
22 | def get_cmplx(n) : Slice(LibCBLAS::ComplexDouble)
^----------------------
Error: undefined constant LibCBLAS::ComplexDouble
Did you mean 'LibCblas'?
When I change the import to LibCblas instead of LibCBLAS, it works as normal.
Thanks in advance
All gates should be able to be serialized and read back in to save trained models to disk.
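One possible shape for this, sketched with the standard library's JSON support (the DenseState struct and its fields are hypothetical, not num.cr types):
require "json"

# Hypothetical sketch: each gate/layer exposes a serializable state struct.
struct DenseState
  include JSON::Serializable

  property weights : Array(Float64)
  property bias : Array(Float64)

  def initialize(@weights, @bias)
  end
end

state = DenseState.new([0.1, 0.2], [0.0])
File.write("dense.json", state.to_json)
restored = DenseState.from_json(File.read("dense.json"))
puts restored.weights # [0.1, 0.2]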
Having fun with this package, but I cannot figure out how to move Tensors onto the GPU instead of the CPU. When I call tensor.opencl, it stays on the CPU. Yes, I have a functioning clblas.
Right now, there is no way to do the following index:
m = Matrix.new [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
m[[0, 1], 1...3]
# should return a matrix
# [[2.0, 3.0],
#  [5.0, 6.0]]
This should be a COPY, not a view, since it can be non-contiguous.
Would it be a good idea to add in a [1, 2, 3].to_tensor method for arrays? Maybe .to_vector and .to_matrix, similar to this repository? It would be a lot easier and prettier than Tensor.from_array [1, 2, 3].
https://github.com/ruivieira/crystal-gsl/blob/fa8ea796f06c77add4bb092be7bbdd93794027d6/src/gsl/base/array.cr
It would be great to have the ability to convert a Tensor to JSON and vice versa. I know I can go the Tensor -> Array -> JSON route, but it should be pretty simple to add.
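Until then, a sketch of the Tensor -> Array -> JSON route mentioned above (this assumes Tensor#to_a returns a flat array; adjust to whatever accessor the release actually exposes):
require "num"
require "json"

t = [[1, 2], [3, 4]].to_tensor
json = {"shape" => t.shape, "data" => t.to_a}.to_json
puts json # {"shape":[2,2],"data":[1,2,3,4]}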
This is a bit of an odd one :) Here is some code that first creates a 3-by-2, transposes it, then scales each column. However, the values in the middle column end up swapped. This only happens when transpose is involved; when the 2-by-3 is created directly, things function correctly.
a = [[0,1], [0,1], [0,1]].to_tensor.astype(Float64)
puts a
# [[0, 1],
# [0, 1],
# [0, 1]]
a = a.transpose
puts a
# [[0, 0, 0],
# [1, 1, 1]]
b = Tensor.diag([1,2,3].to_tensor.astype(Float64))
puts b
# [[1, 0, 0],
# [0, 2, 0],
# [0, 0, 3]]
puts a.matmul(b)
# [[0, 2, 0],
# [1, 0, 3]]
## Expected output:
# [[0, 0, 0],
# [1, 2, 3]]
Again, this works correctly when a = [[0, 0, 0], [1, 1, 1]].to_tensor.astype(Float64).
Even using gsl, Crystal beats some of the naive vector operations that don't drop to blas, but just use gsl, due to the overhead of copying/access. I think it's worth it to just implement the array class in Crystal.
Add in basic CSR, CSC and COO sparse matrices, backed by a sparse blas implementation.
WIP #45
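For reference, a minimal illustration of the CSR layout in plain Crystal (just the data layout, not a proposed API):
# CSR stores non-zero values, their column indices, and row boundary offsets.
record CSR(T), values : Array(T), col_indices : Array(Int32), row_ptr : Array(Int32)

# Dense matrix:
# [[5, 0, 0],
#  [0, 8, 0],
#  [0, 0, 3]]
csr = CSR.new([5, 8, 3], [0, 1, 2], [0, 1, 2, 3])
puts csr.values # [5, 8, 3]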
It would be helpful to have gates for trig functions with simple derivatives, such as sin, cos, etc. The general pattern can be viewed in grad.arithmetic_ops. A Gate must have a backward method and a cache method. So for example, the sin gate would look like:
class Num::Grad::SinGate(T) < Num::Grad::Gate(T)
  getter a : Num::Grad::Variable(T)

  def initialize(@a : Num::Grad::Variable(T))
  end

  def backward(payload : Num::Grad::Payload(T)) : Array(T)
    gradient = payload.variable.grad
    r0 = gradient.map(a.value) do |i, j|
      i * Math.cos(j)
    end
    [r0]
  end

  def cache(result : Num::Grad::Variable(T), *args)
    a = args[0]
    result.grad = T.zeros_like(result.value)
    result.requires_grad = true
    Num::Grad.register("Sin", self, result, a)
  end
end
After the gate is created, an operator should be added directly to Variable, calling this function and caching it on a context:
class Num::Grad::Variable(T)
  def sin : Num::Grad::Variable(T)
    result = @context.variable(@value.sin)
    if self.is_grad_needed
      # The gate needs the input variable, since backward takes cos of a.value
      gate = Num::Grad::SinGate.new(self)
      gate.cache(result, self)
    end
    result
  end
end
Testing the derivative of the sin function:
ctx = Num::Grad::Context(Tensor(Float64)).new
t = [0.0, Math::PI].to_tensor
a = ctx.variable(t)
b = a.sin
b.backprop
puts a.grad
# [1, -1]
At the top of the file src/libs/cblas.cr it shows the default BLAS library is openblas.
{% if flag?(:openblas) %}
  @[Link("openblas")]
{% elsif flag?(:accelerate) %}
  @[Link(framework: "Accelerate")]
{% elsif flag?(:darwin) %}
  @[Link(framework: "Accelerate")]
{% else %}
  @[Link("openblas")] # => change to @[Link("blas")]
{% end %}
Could you change the default library to @[Link("blas")]? Or add a flag to make the link to libblas? It would be helpful for systems where only libblas is installed (like the Crystal env of replit.com).
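One possible shape for that flag, sketched against the snippet above (the :blas flag name is only a suggestion):
{% if flag?(:blas) %}
  @[Link("blas")]
{% elsif flag?(:openblas) %}
  @[Link("openblas")]
{% elsif flag?(:accelerate) %}
  @[Link(framework: "Accelerate")]
{% elsif flag?(:darwin) %}
  @[Link(framework: "Accelerate")]
{% else %}
  @[Link("openblas")]
{% end %}
Users on systems with only libblas could then build with crystal build app.cr -Dblas, while the current defaults stay unchanged for everyone else.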
The latest version of alea is pulled in as a dependency by num.cr. alea's latest now requires Crystal v0.36.0 or greater. This forces num.cr users to move to v0.36.x, as far as I can tell.
It seems Crystal 0.36.1 complains about abstract def cache in Num::Grad::Gate(T). Also, Indexable.range_to_index_and_count can now return nil. I've opened a reference PR to discuss the details of this: #64
I am continuing to work on toy models using num.cr while following some PyTorch examples. As I was setting up the forward calculations using a gradient variable, I needed sum and mean functions, which are currently not implemented.
Tensor(Float64).linear_space(a,b,c)
Tensor(T)#range, etc...
The API docs here https://crystal-data.github.io/num.cr/Tensor.html describe several constructor methods that are not available in any code that pulls the shard from the latest release, which is confusing. Also range, etc. are missing. It's not clear to me if the methods have been intentionally removed, as they look like they were just added in #30.
Line 159 in 32a5d07
Shouldn't it be def arange(start : Number, stop : Number, step : Number = 1)? I'm kinda new to Crystal, so I wasn't super sure what forall U was.
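For context on forall (a sketch, not taken from the library): it introduces a free type variable that the compiler infers from the arguments at each call site.
def first_element(arr : Array(U)) : U forall U
  arr[0]
end

puts first_element([1, 2, 3])  # U inferred as Int32 => 1
puts first_element([1.0, 2.0]) # U inferred as Float64 => 1.0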
Similar to numpy.arange or numpy.linspace, take a start (inclusive), end (exclusive), and delta:
puts Tensor.arange(3)
# [0, 1, 2]
puts Tensor.arange(3.0)
# [ 0.0, 1.0, 2.0]
puts Tensor.arange(3,7)
# [3, 4, 5, 6]
puts Tensor.arange(3,7,2)
# [3, 5]
puts Tensor.arange(10, 5, -2)
# [10, 8, 6]
puts Tensor.arange(15, 10, 1)
# []
puts Tensor.arange(3, 6, 0.5)
# [3.0, 3.5, 4.0, 4.5, 5.0, 5.5]
Edit: add empty-range edge case example.
Edit 2: add non-integer delta example
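A rough sketch of those semantics in plain Crystal (not the library's implementation; it just leans on the .to_tensor conversion used elsewhere on this page):
def arange(start : Number, stop : Number, step : Number = 1)
  # Number of steps from start (inclusive) towards stop (exclusive).
  count = ((stop - start) / step).ceil.to_i
  count = 0 if count < 0
  Array.new(count) { |i| start + i * step }.to_tensor
end

puts arange(3, 7, 2)   # [3, 5]
puts arange(10, 5, -2) # [10, 8, 6]
puts arange(15, 10, 1) # []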
This code, taken partially from the readme:
a = [[1, 2], [3, 4]].to_tensor.astype(Float32)
b = [[3, 4], [5, 6]].to_tensor.astype(Float32)
puts a.matmul(b)
Throws Error: undefined method 'buffer' for Tensor(Float32) after expanding the blas macro.
Edit: From some quick testing, it looks like a swap of .storage instead of .buffer in tensor/linalg#matmul will fix this. Putting together a PR, but let me know if this isn't the right way to fix it.
While building a toy model in Crystal using num.cr I ran into the following compile error.
5 | nt = -t
^
Error: wrong number of arguments for 'Tensor(Int32, CPU(Int32))#-' (given 0, expected 1)
Overloads are:
- Tensor(T, S)#-(other)
I have been mostly messing around with blas/lapack/gsl at this point just to test performance and ability to store and manipulate data. Before the library goes any further I want to nail down some better design for the library. Primarily this means a generic way to support multiple data types while only having to write code once.
My previous messy macro solution to this (there are examples in most core files):
It would be really cool if this could support CUDA as a backend.
In many scenarios it is helpful to have read+write slicing within a sub-matrix/tensor:
a = [[3, 4, 5, 6], [5, 6, 7, 8]].to_tensor
b = [[11, 12, 13]].to_tensor
a[1, 1..] = b
puts a # same as a[.., ..] or a[..., ...] etc.
# [[3, 4, 5, 6],
# [5, 11, 12, 13]]
puts a[0..1, 1...3] # note inclusive vs. exclusive ranges
# [[4, 5],
# [11, 12]]
For contiguous vectors and matrices, avoiding stride checks is much faster. Need to implement a generic way to fast track operations on contiguous arrays.
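An illustrative sketch of the idea in plain Crystal (not num.cr internals): a 1-D map that skips per-element stride arithmetic when the underlying buffer is contiguous.
def map1d(data : Slice(Float64), size : Int32, stride : Int32)
  if stride == 1
    # Fast path: contiguous data, walk the slice directly.
    Array.new(size) { |i| yield data[i] }
  else
    # Generic path: compute a strided offset for every element.
    Array.new(size) { |i| yield data[i * stride] }
  end
end

buf = Slice[1.0, 2.0, 3.0, 4.0]
doubled = map1d(buf, 4, 1) { |x| x * 2 }
strided = map1d(buf, 2, 2) { |x| x * 2 }
puts doubled # [2.0, 4.0, 6.0, 8.0]
puts strided # [2.0, 6.0]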
There should be a base implementation of an n-dimensional tensor that everything inherits from, to make implementing character arrays, structured arrays, and masks easier.
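A hypothetical sketch of what such a base could look like (names are illustrative only): shape/stride bookkeeping lives in the abstract class, and each concrete array type only supplies its own element storage.
abstract class BaseNDArray(T)
  getter shape : Array(Int32)
  getter strides : Array(Int32)

  def initialize(@shape, @strides)
  end

  abstract def [](index : Array(Int32)) : T
end

class CharArray < BaseNDArray(UInt8)
  def initialize(shape : Array(Int32), strides : Array(Int32), @chars : Bytes)
    super(shape, strides)
  end

  def [](index : Array(Int32)) : UInt8
    # Strided lookup: dot product of the index with the strides.
    offset = index.zip(strides).sum { |i, s| i * s }
    @chars[offset]
  end
end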