Comments (8)
I am working on this, but I am currently stuck. I was planning to use System.Numerics.DenseTensor<T>, but I was not able to make it work. The main problem is:
public class DenseTensor<T> : Tensor<T>
{
private readonly Memory<T> memory;
...
Torch tensors are backed by their own storage abstraction (eventually mapping to GPU memory), which does not match Memory<T> (or at least I was not able to make it work). The only option for the moment looks to be having our own TorchTensor<T> inheriting directly from Tensor<T> and backed by Torch storage.
from torchsharp.
@ericstj, can you advise?
DenseTensor is meant to be the implementation where we can agree upon a backing buffer that can be directly accessed by the managed code. Memory is the abstraction for access.
Memory can represent a raw pointer by implementing OwnedMemory; here's an old sample. I believe OwnedMemory underwent some renaming, so it's now called MemoryManager. /cc @ahsonkhan / @GrabYourPitchforks
Do torch Tensors expose a raw buffer that we can wrap (like TF_TensorData), or do they require access through some interface? If the latter, then I'd agree that we can't handle those in DenseTensor today. We could, but we'd have to look hard at the access characteristics to understand whether it made more sense to access through interop vs. copy.
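A minimal sketch of the MemoryManager approach described above, assuming the native side hands us a stable raw pointer (the class name and the caller-owns-the-buffer model here are illustrative, not TorchSharp API):

```csharp
using System;
using System.Buffers;
using System.Runtime.InteropServices;

// Illustrative sketch: expose a native buffer (e.g. what TF_TensorData or a
// Torch storage data pointer would return) as Memory<T> via MemoryManager<T>.
public sealed unsafe class NativeMemoryManager<T> : MemoryManager<T>
    where T : unmanaged
{
    private readonly T* _pointer;
    private readonly int _length;

    public NativeMemoryManager(T* pointer, int length)
    {
        _pointer = pointer;
        _length = length;
    }

    public override Span<T> GetSpan() => new Span<T>(_pointer, _length);

    // Native memory never moves, so pinning is a no-op.
    public override MemoryHandle Pin(int elementIndex = 0)
        => new MemoryHandle(_pointer + elementIndex);

    public override void Unpin() { }

    // Assumption: the native library owns the buffer, so nothing is freed here.
    protected override void Dispose(bool disposing) { }
}
```

With this, `new NativeMemoryManager<float>(ptr, n).Memory` yields a `Memory<float>` view over the native buffer without copying; whether the access characteristics make that worthwhile is exactly the interop-vs-copy question above.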
I think it is the latter: although they do provide a way to retrieve the data pointer, they also provide methods to manipulate the data (get, set, etc.). In my opinion this means that accessing the data directly can be dangerous, since we don't know how they actually store it (I have to look into this).
I did a comparison between the Tensor API and TorchTensor (TH). DISCLAIMER: I haven't actually run the code to compare the methods, only looked at the docs. Below are my notes.
protected Tensor(int length); # this should be long
protected Tensor(ReadOnlySpan<int> dimensions, bool reverseStride); # Neither ReadOnlySpan nor reverseStride exist in TH. Instead of ReadOnlySpan they use explicit constructors with different numbers of parameters, up to 4 dimensions
protected Tensor(Array fromArray, bool reverseStride); # I think this can be obtained using newWithDataAndAllocator
public virtual T this[ReadOnlySpan<int> indices] { get; set; } # Not available in TH
public virtual T this[params int[] indices] { get; set; } # No array version, but different versions of the same method with different numbers of input parameters
public ReadOnlySpan<int> Strides { get; } # In TH you can only ask for the stride of a dimension, not all strides
public ReadOnlySpan<int> Dimensions { get; } # In TH this is called Shape, which I think is more appropriate. In addition, you can only ask for the size of one dimension at a time
public bool IsReversedStride { get; } # This confuses me. I actually think the documentation is wrong: "False (default) to indicate that the first dimension is most major (farthest apart) and the last dimension is most minor (closest together): akin to row-major in a rank-2 tensor." But row-major is when the first dimension is closest: https://en.wikipedia.org/wiki/Row-_and_column-major_order
public int Rank { get; } # This is nDimension in TH.
public long Length { get; } # This is nElement in TH
public bool IsFixedSize { get; } # no in TH
public bool IsReadOnly { get; } # no in TH
public static int Compare(Tensor<T> left, Tensor<T> right); # In TH there are lt and gt, which return a tensor, not an int. I think this can be implemented using a couple of TH methods
public static bool Equals(Tensor<T> left, Tensor<T> right); # equal in TH
public abstract Tensor<T> Clone(); # newClone in TH
public abstract Tensor<TResult> CloneEmpty<TResult>(ReadOnlySpan<int> dimensions); # No in TH
public virtual Tensor<TResult> CloneEmpty<TResult>(); # No in TH
public virtual Tensor<T> CloneEmpty(ReadOnlySpan<int> dimensions); # No in TH
public virtual Tensor<T> CloneEmpty(); # No in TH
public virtual void Fill(T value); # Same
public string GetArrayString(bool includeWhitespace = true); # No in TH
public Tensor<T> GetDiagonal(); # I think this is diag(0) in TH
public Tensor<T> GetDiagonal(int offset); # I think this is diag in TH
public Tensor<T> GetTriangle(); # tril in TH
public Tensor<T> GetTriangle(int offset); # tril in TH
public Tensor<T> GetUpperTriangle(); # triu in TH
public Tensor<T> GetUpperTriangle(int offset); # triu in TH
public abstract T GetValue(int index); # this is get1d, get2d, etc. in TH, i.e., there is no generic get with one index unless it's a 1-d tensor
public Tensor<T> MatrixMultiply(Tensor<T> right); # This is Gemm over 2d tensors
public abstract Tensor<T> Reshape(ReadOnlySpan<int> dimensions); # In TH this might be Resize1d, Resize2d, etc. I need to check, however, because it could also be newWithSize1d, newWithSize2d, etc.
public abstract void SetValue(int index, T value); # this is set1d, set2d, etc
public Tensor<T> Slice(params Range[] ranges); # I think it is narrow, but TH can only do it by dimension, one dimension at a time
public virtual Tensor<T> Slice(ReadOnlySpan<Range> ranges); # same as above
public virtual CompressedSparseTensor<T> ToCompressedSparseTensor(); # this is specific to Tensor<T>
public virtual DenseTensor<T> ToDenseTensor(); # this is specific to Tensor<T>
public virtual SparseTensor<T> ToSparseTensor(); # this is specific to Tensor<T>
protected virtual bool Contains(T item); # not available in TH
protected virtual void CopyTo(T[] array, int arrayIndex); # I can't find this in TH; you can probably do it by playing with storage
protected virtual int IndexOf(T item); # not available in TH
# The following operators exist in TH as named methods rather than operators, e.g., Add, lShift, etc.
public static Tensor<T> operator +(Tensor<T> tensor);
public static Tensor<T> operator +(Tensor<T> tensor, T scalar);
public static Tensor<T> operator +(Tensor<T> left, Tensor<T> right);
public static Tensor<T> operator -(Tensor<T> tensor);
public static Tensor<T> operator -(Tensor<T> left, Tensor<T> right);
public static Tensor<T> operator -(Tensor<T> tensor, T scalar);
public static Tensor<T> operator ++(Tensor<T> tensor);
public static Tensor<T> operator --(Tensor<T> tensor);
public static Tensor<T> operator *(Tensor<T> tensor, T scalar);
public static Tensor<T> operator *(Tensor<T> left, Tensor<T> right);
public static Tensor<T> operator /(Tensor<T> tensor, T scalar);
public static Tensor<T> operator /(Tensor<T> left, Tensor<T> right);
public static Tensor<T> operator %(Tensor<T> tensor, T scalar);
public static Tensor<T> operator %(Tensor<T> left, Tensor<T> right);
public static Tensor<T> operator &(Tensor<T> left, Tensor<T> right);
public static Tensor<T> operator &(Tensor<T> tensor, T scalar);
public static Tensor<T> operator |(Tensor<T> tensor, T scalar);
public static Tensor<T> operator |(Tensor<T> left, Tensor<T> right);
public static Tensor<T> operator ^(Tensor<T> tensor, T scalar);
public static Tensor<T> operator ^(Tensor<T> left, Tensor<T> right);
public static Tensor<T> operator <<(Tensor<T> tensor, int value);
public static Tensor<T> operator >>(Tensor<T> tensor, int value);
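Regarding the GetValue/SetValue notes above: a generic indexer can bridge to TH's get1d-style access by flattening multi-dimensional indices into a single storage offset. A sketch (names here are illustrative, not part of either API), which also shows the row-major convention where the last dimension is the contiguous one:

```csharp
using System;

// Sketch: compute row-major strides for a shape, then flatten multi-dim
// indices into the single offset a get1d-style accessor would take.
static class Indexing
{
    public static int[] RowMajorStrides(ReadOnlySpan<int> shape)
    {
        var strides = new int[shape.Length];
        int stride = 1;
        // Row-major: the last dimension is contiguous (stride 1);
        // the first dimension is farthest apart.
        for (int i = shape.Length - 1; i >= 0; i--)
        {
            strides[i] = stride;
            stride *= shape[i];
        }
        return strides;
    }

    public static int FlattenIndex(ReadOnlySpan<int> indices, ReadOnlySpan<int> strides)
    {
        int offset = 0;
        for (int i = 0; i < indices.Length; i++)
            offset += indices[i] * strides[i];
        return offset;
    }
}
```

For example, with shape {2, 3, 4} the strides are {12, 4, 1}, and indices {1, 2, 3} flatten to offset 23.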
Note that to subclass Tensor you really only need to implement a few methods: at a minimum the abstracts, but maybe a few more if you have more efficient implementations. See https://github.com/dotnet/corefxlab/blob/master/src/System.Numerics.Tensors/System/Numerics/Tensors/SparseTensor.cs for an example.
Most of those arithmetic operators aren't staying in for V1, we plan to remove them. dotnet/corefxlab#1798
It's probably not a great idea to subtype Tensor and use interop for element access. I can imagine that giving very bad performance for large tensors. Probably better to just copy out if the only way to get data out is through element access...
I took a quick look at torch API and wanted to point out a few things:
https://pytorch.org/docs/stable/torch.html#torch.from_numpy
Creates a Tensor from a numpy.ndarray.
The returned tensor and ndarray share the same memory. Modifications to the tensor will be reflected in the ndarray and vice versa. The returned tensor is not resizable.
This indicates that torch does support wrapping a shared buffer, so at least going into torch should be possible with DenseTensor (which has compatible layout with ndarray).
https://pytorch.org/docs/stable/tensors.html?highlight=numpy#torch.Tensor.numpy
Returns self tensor as a NumPy ndarray. This tensor and the returned ndarray share the same underlying storage. Changes to self tensor will be reflected in the ndarray and vice versa.
This indicates that torch does support exposing its tensor as a shared buffer, so data coming out of torch should also be possible with DenseTensor (if we wrap that shared buffer in a Memory).
/cc @tannergooding
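The sharing contract torch.from_numpy describes is the same one Memory<T> gives us: a view, not a copy, so writes through either side are visible to the other. A trivial managed-only illustration of that contract (DenseTensor would sit on top of such a Memory<T>):

```csharp
using System;

float[] buffer = { 1f, 2f, 3f, 4f };

// Memory<T> is a view over the array, not a copy: mutations through the
// view are visible in the original buffer and vice versa, the same
// semantics torch.from_numpy documents for tensor/ndarray sharing.
Memory<float> view = buffer.AsMemory();
view.Span[0] = 42f;

Console.WriteLine(buffer[0]); // 42: buffer and view share storage
```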
The problem I see with targeting DenseTensor is that we limit ourselves to supporting dense tensors living in CPU memory only. This may or may not be a limitation, of course.
The idea of having indirect access to tensors through interop is that we don't have to make any assumptions about where the tensor lives or how it is represented.
Regarding performance: random access might be bad, but if we use PyTorch native operations they will probably be better than SNT because they can make use of BLAS, GPU, MKL, etc. Clearly there is a problem if we want to do an operation between a tensor living in Torch and one living in SNT, but I see this as a different issue.
The problem I see with targeting DenseTensor is that we limit ourselves to supporting dense tensors living in CPU memory only. This may or may not be a limitation, of course.
It seems to me that we are arriving at a structure not unlike numpy/torch in Python: we can convert SNT to TorchSharp tensors and back with zero data copies, as long as they are dense and on the CPU. If not, a copy is involved in the conversion. The conversion can be exposed via extension methods on SNT and the TorchSharp Tensor, e.g. ToDenseTensor() and ToTorchTensor().
Given that, and
Most of those arithmetic operators aren't staying in for V1, we plan to remove them.
We could leave the integration between TorchSharp and SNT at that conversion, right? All arithmetic operations would be defined on the TorchSharp types anyway.
Does this make sense?