mmuckley / torchkbnufft Goto Github PK
View Code? Open in Web Editor NEWA high-level, easy-to-deploy non-uniform Fast Fourier Transform in PyTorch.
Home Page: https://torchkbnufft.readthedocs.io/
License: MIT License
A high-level, easy-to-deploy non-uniform Fast Fourier Transform in PyTorch.
Home Page: https://torchkbnufft.readthedocs.io/
License: MIT License
When I use self.nufft_ob = tkbn.KbNufft(im_size=im_size, #grid_size=(256,256), ).to(self.dev)
,
I got warning and shutdown
DeprecationWarning: `np.complex` is a deprecated alias for the builtin `complex`. To silence this warning, use `complex` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.complex128` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
but there is no problem with tkbn.ToepNufft()
and I want to know how to inverse NuFFT with toepliz.
Thank you
Thank you very much for putting torchkbnufft
together. I like the fact that it supports autograd
. My question is how does one take an inverse non-uniform Fourier Transform with torchkbnufft
? Thanks!
Hi Matthew,
Thanks for such an elegant implementation for NUFFT on Pytorch.
I have on little question: when choosing 'ortho' norm and the grid_size is not exactly 2*im_size, the scale of ToepNUFFT is inconsistent with AdjKbNufft(KbNufft()). This also happens before 1.0.0.
Thanks in advance.
Not necessary to ask this. Close the issue.
Hi,
thanks for this package :)!
However, I was wondering about the performance. I would have expected maybe x10 in difference to a standard FFT (from my past experience with CPU based NFFTs.). But here, I observe a difference in factor x100. The adjoint is almost a factor of x1000 slower.
Any thoughts on this?
Felix
Code:
import torch
import torchkbnufft as tkbn
import numpy as np
from skimage.data import shepp_logan_phantom
import matplotlib.pyplot as plt
import napari
import glob
import imageio.v3 as iio
import os
from tqdm import tqdm
device = 'cuda' if torch.cuda.is_available() else 'cpu'
N = 512
N_z = 100
N_angles = int(np.ceil(np.pi * N / 2))
voxels = (torch.zeros(1, N_z, N, N, dtype=torch.float32) + 1j * 0).to(device).to(torch.complex64)
def prepare_nufft(arr, N_angles):
N = arr.shape[-1]
k_1d = np.reshape(np.linspace(-0.5, 0.5, N + 1)[:-1], (-1, 1))
angles = np.reshape(np.linspace(0, np.pi, N_angles + 1)[:-1], (1, -1))
k_y = np.cos(angles) * k_1d
k_x = np.sin(angles) * k_1d
ks = np.reshape(np.stack((k_y, k_x)), (2, -1))
ktraj = torch.tensor(ks, dtype=torch.float32).to(device)
nufft = tkbn.KbNufft(im_size=(N, N), numpoints=6).to(device)
adjnufft = tkbn.KbNufftAdjoint(im_size=(N, N), numpoints=6).to(device)
return nufft, adjnufft, ktraj
nufft, adjnufft, ktraj = prepare_nufft(voxels, N_angles)
%%time
kdata = nufft(voxels, ktraj)
CPU times: user 30.1 ms, sys: 0 ns, total: 30.1 ms
Wall time: 29.6 ms
%%time
#kdata = nufft(voxels, ktraj)
img_filter = adjnufft(kdata, ktraj)
#torch.cuda.synchronize()
CPU times: user 268 ms, sys: 172 ms, total: 440 ms
Wall time: 394 ms
%%time
torch.fft.fft(voxels)
CPU times: user 337 µs, sys: 180 µs, total: 517 µs
Wall time: 283 µs
Hi,
I am interested in the way you implemented density compensation.
On this line, you use [-2:, ...]
on a tensor that is of shape [2, 324000]
. Therefore I don't understand the use of slicing in this instance.
The reason I want to know more is because I want to use density compensation in 3D, and I think that in the current state, this slicing is preventing it. Other than that, do you see any reason why the current density compensation wouldn't work well for 3D trajectories?
If I read
https://en.wikipedia.org/wiki/Non-uniform_discrete_Fourier_transform#Definition
correctly, then "adjoint nonuniform DFT" (also called "nonuniform discrete Fourier transform of type II") would entail interpolating in space.
Thus, I expect that KbNufftAdjoint
would not operate in frequency space, as seems to be suggested by the docs. Is there a standard reduction in this field that allows me to interpolate in space using this logic please?
Hello,
I encountered this problem on our computation server, it has a dual socket setup with two cpus each having 64 cores (together 128 cores and 256 threads). I have a dynamic dataset with radial sampling where are 100 frames each having 37 spokes (different trajectories). I wanted to compute the density compensation in a batched way, so that it is faster. When run in the batched form I get errors failing obtaining resources from the libgomp:
libgomp: Thread creation failed: Resource temporarily unavailable
This does not differ if I run the compensation on a CPU (the trajectories are on a CPU device) or they are on a GPU (CUDA).
On the other hand when I run this density compensation in a for loop (each 2D frame at a time), I get the results without an issue.
Also when working with the forward/adjoint operators later on they work in a batched form just fine(although quite slow). This problem is related only to the density compensation function. Might this be an issue with thread allocation specifically in this function?
Unfortunately I am able to reproduce this problem only at this specific setup, I tried it on a different PC and there the batch computation works just fine but quite slow even on a GPU. When I looked at the GPU usage it was very low, while the CPU usage was quite high. I first thought that there is some precomputation going on when the trajectories are not the same for each frame, although it might be related to the fact that it is creating too many threads in that case. I am not that familiar with this implementation. I am used to planning the NUFFT as in for example the gpuNUFFT or pynufft and then applying the transforms and here it seems different, as no planning step is taken ahead of applying the forward/adjoint operations.
Hi,
I know it's bold claim, but I think there is an issue with the autograd functionality in the forward nufft operation. In order to reproduce the problem I modified the cell [8] in the Basic Example in the following way:
image.requires_grad = True
kdata = nufft_ob(image, ktraj)
kdata.requires_grad
The first time I executed the cell, the output is True, as expected. But every subsequent time, it prints False. I am not an expert in the torch autograd functionality but I would expect the output to have the same requires_grad value as the the input.
I also did the same test with the adjoint nufft (cell [10] modified):
kdata.requires_grad = True
image_blurry = adjnufft_ob(kdata, ktraj)
print(image_blurry.requires_grad)
... and the output was always True. The same holds for the examples in v0.3.4.
Could it be that the nufft_ob's internal state changes with every call?
Thanks a lot!
Hi,
I am a bit confused about how the ktraj is defined.
In the examples you provide, you use a radial k-space trajectory, and it's shape is (2, 2 x nspokes x image_size).
How can I define a simple Cartesian trajectory?
Thanks!
I made a change in #27 that could be a potential issue, so I would like to document it here. Particularly, in this commit the following lines were commented out:
if USING_OMP and cpu_device:
torch.set_num_threads(num_threads)
The reason for this was that in PyTorch 1.8 these lines led to a severe performance regression - i.e., at the Python level it seemed PyTorch wasn't handling switching the number of available threads very well. I removed the lines as the regression was too large.
The downside is that shell-based OMP thread management may be ignored within forks - OMP specifies the number of threads that can be spawned but does not keep a global limit, so if your global limit is 8 and you fork 8 times, each one of those forks could create 8 new threads and lead to oversubscription.
In general it's a niche issue that I hope doesn't affect most people. I haven't been able to figure out how to fix this - the answer may be to just go to C++ as mentioned in Issue #28 if anyone decides to tackle this.
Hello,torchkbnufft is great but the example is not easy.So,I want to know if I want to compute the Fourier transform of a two-dimensional array by use torchkbnufft,what should I do?
example:a = torch.rand((4,4))
fft: b = torch.fft.fft2(a)
torchkbnufft:???
I am noticing the performance degradation with PyTorch 1.8 for the CPU on my home system (Windows 10, i5 8400, GTX 1660, torchkbnufft
version 1.1.0). It looks like the GPU is relatively unaffected. The details are below. I'm not sure why this is happening yet, but I will try to look into it. If anyone has any information, feel free to post on this issue.
PyTorch 1.8:
running profiler...
im_size: (256, 256), spokelength: 512, num spokes: 405, ncoil: 15, batch_size: 1, device: cpu, sparse_mats: False, toep_mat: False, size_3d: None
forward average time: 2.0657340599999996, backward average time: 3.4234444799999992
im_size: (256, 256), spokelength: 512, num spokes: 405, ncoil: 15, batch_size: 1, device: cpu, sparse_mats: False, toep_mat: True, size_3d: None
toeplitz forward/backward average time: 0.13343995500000005
im_size: (256, 256), spokelength: 512, num spokes: 405, ncoil: 15, batch_size: 1, device: cpu, sparse_mats: True, toep_mat: False, size_3d: None
forward average time: 1.0262545000000016, backward average time: 1.0705226799999992
im_size: (256, 256), spokelength: 512, num spokes: 405, ncoil: 15, batch_size: 1, device: cuda, sparse_mats: False, toep_mat: False, size_3d: None
GPU forward max memory: 0.159003136 GB, forward average time: 0.074529785, GPU adjoint max memory: 0.152530432 GB, backward average time: 0.0685140699999998
im_size: (256, 256), spokelength: 512, num spokes: 405, ncoil: 15, batch_size: 1, device: cuda, sparse_mats: False, toep_mat: True, size_3d: None
GPU forward max memory: 0.114505216 GB, toeplitz forward/backward average time: 0.006467924000000096
im_size: (256, 256), spokelength: 512, num spokes: 405, ncoil: 15, batch_size: 1, device: cuda, sparse_mats: True, toep_mat: False, size_3d: None
GPU forward max memory: 0.77268992 GB, forward average time: 0.20692490499999963, GPU adjoint max memory: 1.035167232 GB, backward average time: 0.2132972450000004
PyTorch 1.7.1:
running profiler...
im_size: (256, 256), spokelength: 512, num spokes: 405, ncoil: 15, batch_size: 1, device: cpu, sparse_mats: False, toep_mat: False, size_3d: None
forward average time: 1.8955573599999997, backward average time: 1.6387825
im_size: (256, 256), spokelength: 512, num spokes: 405, ncoil: 15, batch_size: 1, device: cpu, sparse_mats: False, toep_mat: True, size_3d: None
toeplitz forward/backward average time: 0.12237997000000007
im_size: (256, 256), spokelength: 512, num spokes: 405, ncoil: 15, batch_size: 1, device: cpu, sparse_mats: True, toep_mat: False, size_3d: None
forward average time: 0.8352743000000004, backward average time: 1.01682184
im_size: (256, 256), spokelength: 512, num spokes: 405, ncoil: 15, batch_size: 1, device: cuda, sparse_mats: False, toep_mat: False, size_3d: None
GPU forward max memory: 0.158736896 GB, forward average time: 0.07951689000000002, GPU adjoint max memory: 0.152530432 GB, backward average time: 0.06854967499999987
im_size: (256, 256), spokelength: 512, num spokes: 405, ncoil: 15, batch_size: 1, device: cuda, sparse_mats: False, toep_mat: True, size_3d: None
GPU forward max memory: 0.114505216 GB, toeplitz forward/backward average time: 0.006591889999999978
im_size: (256, 256), spokelength: 512, num spokes: 405, ncoil: 15, batch_size: 1, device: cuda, sparse_mats: True, toep_mat: False, size_3d: None
GPU forward max memory: 0.77268992 GB, forward average time: 0.2121914199999999, GPU adjoint max memory: 1.035167232 GB, backward average time: 0.21654677000000006
Thank you for sophisticated implementation of NUFFT with pytorch.
I have a question about "Basic KB-NUFFT Example".
In the code, I compared the original image of the shepp-logan and the final product of tkbn.KbNufftAdjoint, namely "sharp image (with Pipe dcomp)" in the last image,
and found that the accuracy is in order of ~10%.
Is this an expected result (since its purpose is demonstration)?
And if so, could you kindly teach me how to improve it to moderate accuracy like ~0.1%?
My environment is:
MacOS 11.3.1
python 3.8.3
torch 1.10.2
torchvision 0.11.3
jupyter 6.0.3
The omega is given by np.mgrid as on-grid locaiton 2 * pi * n / N
data = torch.randn(1, 1, 8, 8) + 1j * torch.randn(1, 1, 8, 8)
[x, y] = np.mgrid[-4:4, -4:4].astype(np.float32) * np.pi / 8
omega = torch.stack([torch.tensor(x), torch.tensor(y)], dim=0)
adjkb_ob = tkbn.KbInterpAdjoint(im_size=(8, 8), grid_size=(8, 8))
image = adjkb_ob(data.view(1, 1, -1), omega.view(2, -1))
plt.figure()
plt.imshow(data.squeeze().abs())
plt.figure()
plt.imshow(image.squeeze().abs())
plt.show()
The left one is the original data and the right is after interpolation. They are quite different !
I wonder whether the on-grid location after interpolation is 2 * pi * n / N ?
Hello, i have several 1D time series data that is non-uniformly sampled. How would i apply this case in your package? Thank you very much for the help in advance :)
Hi @mmuckley, thank you for the wonderful work. I'm using the KbNufftAdjoint for spiral data and am wondering how do I disable the oversampling during gridding? I tried to set the grid_size the same as the im_size to avoid it but it doesn't seem to work. The final result still looks smaller than it should be. I tried various settings with no success....
Hi,
I'm trying to perform on-the-fly data undersampling in my PyTorch dataset. To do this, I perform a Toeplitz NUFFT in the __getitem__
function of my Dataset
class. This works as expected. Now, I want to to batching, so I wrap the PyTorch Dataset
in a PyTorch DataLoader
. This works as expected when num_workers=0
. However, when num_workers
is non-zero, computation of the NUFFT seemingly enters an infinite loop.
Performing a NUFFT in parallel using multiple workers should result in undersampled images.
Sampling the dataloader results in a hanging script, seemingly entering an infinite loop.
import torch
from skimage.data import shepp_logan_phantom
from skimage.transform import rescale
import numpy as np
import torchkbnufft as tkbn
import matplotlib.pyplot as plt
from torch.utils.data import Dataset, DataLoader
NUM_WORKERS=0 # set to > 0 to hang
class DataUndersampler(Dataset):
def __init__(self, undersampling_factor, spatial_scaling_factor):
self.image = shepp_logan_phantom()
self.image_shape = self.image.shape
self.undersampling_factor = undersampling_factor
self.spatial_scaling_factor = spatial_scaling_factor
# Need the original size to determine the undersampling factor and NUFFT operators
self.orig_size = self.image_shape
self.image_shape = (self.image_shape[0] // self.spatial_scaling_factor, self.image_shape[1] // self.spatial_scaling_factor)
# Create an oversampled grid
spokelength = self.image_shape[0] * 2
self.grid_size = (spokelength, spokelength)
# Generate A LOT of spokes, pick a starting point at random
nspokes = 2000
# Sample enough spokes to achieve undersampling factor
self.spokes_to_sample = int((self.orig_size[0] * np.pi / 2) / self.undersampling_factor)
# Generate a golden angle radial trajectory
ga = np.deg2rad(180 / ((1 + np.sqrt(5)) / 2))
kx = np.zeros(shape=(spokelength, nspokes))
ky = np.zeros(shape=(spokelength, nspokes))
ky[:, 0] = np.linspace(-np.pi, np.pi, spokelength)
for i in range(1, nspokes):
kx[:, i] = np.cos(ga) * kx[:, i - 1] - np.sin(ga) * ky[:, i - 1]
ky[:, i] = np.sin(ga) * kx[:, i - 1] + np.cos(ga) * ky[:, i - 1]
self.ky = np.transpose(ky)
self.kx = np.transpose(kx)
# 1D Ramlak. Needed for density compensation. Density is a linear function
# depending on the distance of the center of k-space...
ram_lak = np.abs(np.linspace(-1, 1, spokelength + 1))
ram_lak = ram_lak[:-1]
#... except for the center, we know exactly how often we sample that,
# namely as many times as the number of spokes
middle_idx = len(ram_lak) // 2
ram_lak[middle_idx] = 1/(2 * self.spokes_to_sample)
self.ram_lak = ram_lak
def __len__(self):
return 1
def __getitem__(self, index):
if self.spatial_scaling_factor > 1:
img = rescale(self.image, 1/self.spatial_scaling_factor).astype(np.complex)
else:
img = self.image.astype(np.complex)
if self.undersampling_factor > 1:
img_tensor = torch.from_numpy(img).unsqueeze(0).unsqueeze(0)
toep_ob = tkbn.ToepNufft()
# We pick a random starting spoke
offset = np.random.choice(range(self.ky.shape[0]-self.spokes_to_sample))
# And select as many subsequent spokes to reach the desired undersampling factor
# todo: use continuing trajectories for cine?
selected_ky = self.ky[offset:offset+self.spokes_to_sample].flatten()
selected_kx = self.kx[offset:offset+self.spokes_to_sample].flatten()
ktraj = torch.tensor(np.stack((selected_ky, selected_kx), axis=0))
# Repeat and reshape the ram-lak so every spoke is density-compensated
ram_lak_t = torch.from_numpy(np.tile(self.ram_lak, self.spokes_to_sample)).unsqueeze(0).unsqueeze(0)
# Calculate the really efficient Toeplitz kernel to compute the NUFFT
dcomp_kernel = tkbn.calc_toeplitz_kernel(ktraj, self.image_shape, weights=ram_lak_t, norm='ortho',numpoints=(3,3)) # with density compensation
# And in a single step, compute the radial k-space and back to image space
img = toep_ob(img_tensor, dcomp_kernel, norm='ortho').abs().squeeze().numpy()
# renormalize the output, because undersampling can change this.
img /= np.max(img)
return np.abs(img)
# Create dataset
dset = DataUndersampler(
undersampling_factor=2,
spatial_scaling_factor=1)
# this works fine
undersampled_img = dset[0]
# From here, the observed behaviour emerges.
dloader = DataLoader(dset,
shuffle=False,
batch_size=1, num_workers=NUM_WORKERS)
# this statement hangs when num_workers > 0
undersampled_img = next(iter(dloader))
Can the current package handle 2x oversampling data along readout direction? For example,if readout point of each spoke is 512(2x oversampling included),image size is 256,grid size is 384。
Hi,
I'm not quite familiar with the mathematics of the kernel but it looks like that the kernel has some boundary issues.
I would have expected that it decreases with intensities but it doesn't.
This is for the use-case of radial polar patterns.
What you see is the real part of the kernel, after fftshift.
Did you observe that before?
Best,
Felix
For example, in the provided code example Toeplitz Example,you can find that:
After applying the nufft operation, the range of the data has changed a lot. Although it is possible to force the unification of the data range through data normalization, how to normalize itself is still debatable.
x != A'Ax
Hi,
When I try to import the lib, there raises an error: AttributeError: module 'torch' has no attribute 'complex32'. How can I solve the issue?
Thanks!
Hi,
My problem is quite specific but I try my best to describe it here:
I have two loss functions - let's call them loss1 and loss2. loss1 is computed from an input image x, applying nufft operator on it and then computing loss1, i.e. f(x) := loss1(nufft(x)) (more operations are involved but they are irrelevant here).
Subsequently, I call torch.autograd.grad on f to compute loss2, i.e. f'(x) is part of the backpropagation graph for loss2. However, when I try to execute my code, sometimes, I get the following error message in the line where the autograd.grad function is called:
Frankly, the setup is quite complicated but it works fine when I remove the nufft operator from the function f. When the above error doesn't appear (usually, when I restart the kernel in Jupyter), the code breaks only at the final point when backward() on loss2 is called with an even more cryptic message:
Any suggestions what the underlying problem might be? Happy to provide more information if needed.
Thanks and best,
ajlok3
Since some features of pytorch are not yet supported for ComplexFloat tensors, it would be desirable to have a "switch" to turn off complex tensor completely (maybe in a different branch?).
My particular use case involves multi-GPU training with nccl backend. The error message I get is:
RuntimeError: Input tensor data type is not supported for NCCL process group: ComplexFloat
... and the complex tensors don't seem to come from my code but from within KbNUFFT object within the model.
Hi there,
I tried to use the torchkbnufft for 3D non-Cartesian trajectory for simulation. I used a 1x1x64x64x64 image and 3D radial trajectory as input to kbNufft for forward NUFFT and then use the acquired kdata for adjoint NUFFT, the output image doesn't look right. I'm wondering if torchkbnufft can be used for 3D non-Cartesian reconstruction? If so, are there any examples for 3D non-Cartesian reconstruction using this package?
Yours,
Qijia
When using batched inputs with varying ktraj sizes, I get a TorchScript RuntimeError when using a cuda GPU. On CPU, the error does not occur. See following example:
import torch
import torchkbnufft as tkbn
torch.manual_seed(0)
device = torch.device("cuda:0")
batch_size = 16
x = torch.rand((256, 256))
im_size = x.shape
x = x.unsqueeze(0).unsqueeze(0).to(torch.complex64)
x = x.repeat(batch_size, 1, 1, 1).to(device)
for _ in range(10):
klength = 32 + torch.randint(0, 128, size=(1,)).item()
print(klength)
ktraj = torch.stack(
(torch.zeros(klength), torch.linspace(-torch.pi, torch.pi, klength))
)
ktraj = ktraj.to(device)
ktraj = ktraj.unsqueeze(0).repeat(batch_size, 1, 1)
nufft_ob = tkbn.KbNufft(im_size=im_size).to(device)
kdata = nufft_ob(x, ktraj)
Output for me is
104
138
88
61
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
Cell In[2], line 21
18 ktraj = ktraj.unsqueeze(0).repeat(batch_size, 1, 1)
20 nufft_ob = tkbn.KbNufft(im_size=im_size).to(device)
---> 21 kdata = nufft_ob(x, ktraj)
File ~/miniforge3/envs/py310/lib/python3.10/site-packages/torch/nn/modules/module.py:1110, in Module._call_impl(self, *input, **kwargs)
1106 # If we don't have any hooks, we want to skip the rest of the logic in
1107 # this function, and just call forward.
1108 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1109 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1110 return forward_call(*input, **kwargs)
1111 # Do not call functions when jit is used
1112 full_backward_hooks, non_full_backward_hooks = [], []
File ~/miniforge3/envs/py310/lib/python3.10/site-packages/torchkbnufft/modules/kbnufft.py:211, in KbNufft.forward(self, image, omega, interp_mats, smaps, norm)
208 assert isinstance(self.table_oversamp, Tensor)
209 assert isinstance(self.offsets, Tensor)
--> 211 output = tkbnF.kb_table_nufft(
212 image=image,
213 scaling_coef=self.scaling_coef,
214 im_size=self.im_size,
215 grid_size=self.grid_size,
216 omega=omega,
217 tables=tables,
218 n_shift=self.n_shift,
219 numpoints=self.numpoints,
220 table_oversamp=self.table_oversamp,
221 offsets=self.offsets.to(torch.long),
222 norm=norm,
223 )
225 if not is_complex:
226 output = torch.view_as_real(output)
File ~/miniforge3/envs/py310/lib/python3.10/site-packages/torchkbnufft/functional/nufft.py:172, in kb_table_nufft(image, scaling_coef, im_size, grid_size, omega, tables, n_shift, numpoints, table_oversamp, offsets, norm)
169 is_complex = False
170 image = torch.view_as_complex(image)
--> 172 data = kb_table_interp(
173 image=fft_and_scale(
174 image=image,
175 scaling_coef=scaling_coef,
176 im_size=im_size,
177 grid_size=grid_size,
178 norm=norm,
179 ),
180 omega=omega,
181 tables=tables,
182 n_shift=n_shift,
183 numpoints=numpoints,
184 table_oversamp=table_oversamp,
185 offsets=offsets,
186 )
188 if is_complex is False:
189 data = torch.view_as_real(data)
File ~/miniforge3/envs/py310/lib/python3.10/site-packages/torchkbnufft/functional/interp.py:117, in kb_table_interp(image, omega, tables, n_shift, numpoints, table_oversamp, offsets)
114 is_complex = False
115 image = torch.view_as_complex(image)
--> 117 data = KbTableInterpForward.apply(
118 image, omega, tables, n_shift, numpoints, table_oversamp, offsets
119 )
121 if is_complex is False:
122 data = torch.view_as_real(data)
File ~/miniforge3/envs/py310/lib/python3.10/site-packages/torchkbnufft/_autograd/interp.py:87, in KbTableInterpForward.forward(ctx, image, omega, tables, n_shift, numpoints, table_oversamp, offsets)
81 """Apply table interpolation.
82
83 This is a wrapper for for PyTorch autograd.
84 """
85 grid_size = torch.tensor(image.shape[2:], device=image.device)
---> 87 output = table_interp(
88 image=image,
89 omega=omega,
90 tables=tables,
91 n_shift=n_shift,
92 numpoints=numpoints,
93 table_oversamp=table_oversamp,
94 offsets=offsets,
95 )
97 ctx.save_for_backward(
98 omega, n_shift, numpoints, table_oversamp, offsets, grid_size, *tables
99 )
101 return output
File ~/miniforge3/envs/py310/lib/python3.10/site-packages/torchkbnufft/_nufft/interp.py:376, in table_interp(image, omega, tables, n_shift, numpoints, table_oversamp, offsets, min_kspace_per_fork)
374 if USING_OMP and image.device == torch.device("cpu"):
375 torch.set_num_threads(threads_per_fork)
--> 376 kdat = table_interp_fork_over_batchdim(
377 image, omega, tables, n_shift, numpoints, table_oversamp, offsets, num_forks
378 )
379 if USING_OMP and image.device == torch.device("cpu"):
380 torch.set_num_threads(num_threads)
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
File "/home/CODE1/320171775/miniforge3/envs/py310/lib/python3.10/site-packages/torchkbnufft/_nufft/interp.py", line 265, in table_interp_fork_over_batchdim
# collect the results
return torch.cat([torch.jit.wait(future) for future in futures])
~~~~~~~~~~~~~~ <--- HERE
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
RuntimeError: MALFORMED INPUT: Index out of bounds in check_bounds. Index: 511; bounds: [0, 1). - ret_val(dtype=int64_t, sizes=[1], strides=[1])
Hi,
as far as I can see, there is no support for real-valued input arrays?
However, with rfft
we could save a factor of 2 on computation for those cases by using rfft
.
Are there some issues with that?
Best,
Felix
Thanks for the great effort of bringing NUFFT to Pytorch!
The current version of the code is geared towards image analysis, and MRI specifically.
As such, it is hard to grasp the meaning of arguments and their names from the examples.
It would be great to have a basic 1D example of forward/inverse NUFFT-I/NUFFT-II to broaden user
audience
Hi,
thanks for the great package!
Is it right that both nufft and adjoint nufft are only differentiable w.r.t. the input image/k-space but not w.r.t. the coordinates in the input trajectory?
Is it feasible to add gradients w.r.t. to the coordinates as well?
Best,
Tobit
Hello,
I am aware this is a very specific question, but I will try, and if you think this is not the setting for such a discussion feel free to lock the issue.
So, I am trying to use this library to reconstruct real MRI data acquired on a Siemens mMR scanner.
For that purpose I am able to use 2 different data export formats:
Decoding the .dat file, and writing the trajectory myself as a python function worked for a radial stack-of-stars sequence, and I am able to get nice images using your library.
For a SpiralVIBE UTE, instead, I was trying to decode those cfl files, as I don't exactly know how to code the trajectory myself. Decoding those files give me both the kspace data and the trajectory.
However, the shape of the arrays I get from those .cfl files is most likely not suited for what tkbn expects me to feed to the nufft operator and needs to be tweaked. The issue is that I am failing to understand how to do it in the correct way.
The original shape of those arrays is:
without any reshape or transposition, I always get an error saying: ValueError: If omega has batch dim, omega batch dimension must match data
I have tried all sorts of transpositions with no success. Even trying to unsqueeze the kspace array gives the same error.
The only solution that 'works' is when I unravel both arrays to:
But even in this case, the images I get inverting the kspace are just a 'point' in the center. of the FOV.
I was wondering if you had tried working with a similar type of 3D spiral trajectory and could me any tip about how to organize those arrays before using them as input for the nufft operator.
Thank you!
We should have some parameter that the user can pass that controls the number of threads used for a NUFFT. Presently we discover the number of threads on the system here. What this feature proposes is to pass in num_threads
as a parameter. By default it can be None
, which would revert to the current behavior.
This may address issues with using NUFFTs in dataloaders with multiple workers as seen in #73.
I got these warnings with the version of torchkbnufft
that I installed through pip3
by following:
sudo pip3 install torchkbnufft
/usr/local/lib/python3.8/dist-packages/odak-0.1.5-py3.8.egg/odak/wave/classical.py:82: UserWarning: Casting complex values to real discards the imaginary part (Triggered internally at /pytorch/aten/src/ATen/native/Copy.cpp:162.)
H = torch.tensor(H).to(dtype).unsqueeze(0).unsqueeze(0)
/usr/local/lib/python3.8/dist-packages/torchkbnufft/kbnufft.py:109: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at /pytorch/torch/csrc/utils/tensor_numpy.cpp:141.)
torch.tensor(np.imag(self.scaling_coef))
/usr/local/lib/python3.8/dist-packages/torchkbnufft/nufft/fft_functions.py:46: UserWarning: The function torch.fft is deprecated and will be removed in PyTorch 1.8. Use the new torch.fft module functions, instead, by importing torch.fft and calling torch.fft.fft or torch.fft.fftn. (Triggered internally at /pytorch/aten/src/ATen/native/SpectralOps.cpp:567.)
x = torch.fft(x, grid_size.numel())
Hello everyone.
Since I built this repository, I've learned about Python and would like to make a number of changes to the backend of torchkbnufft
. Among other things, I'd like to minimize the use of high-level Python constructs like dicts and make the backend functions scriptable by torch.jit.script
. This will likely involve several backwards compatibility-breaking changes. I'll try to keep track of progress in this Issue.
Hey Matt, I seem to be consistently getting an error when attempting to call "calc_toeplitz_kernel" with 'omega' with dimension (N, 3, klength) and 'weights' with dimension (N, 1, klength). This seems to occur even when N=1. The error goes away if I simply don't supply the density compensation weights. Any suggestions for how to fix?
Hello @mmuckley ,
Thank you very much for sharing such a great work.
When I try some open-source project, some errors happen, because some functions like MriSenseNufft, AdjMriSenseNufft, AdjKbNufft, ToepSenseNufft, etc., have been deprecated.
If I still want to use the latest version of torchkbnufft, how can I solve this problem?
Thanks a lot.
Best Regards,
kingaza
Hi,
Since I work with cardiac MRI k-space data (golden angle radial acquisition, 2D + time), my application involves a large number of small NUFFTs. Currently, the only way to process the data is to loop over the time dimension because there is no batch dimension in the trajectory tensor (i.e., different trajectories for every 2D slice).
Is there a more efficient solution than just iterating over the batch dimension with diffenrent trajectories? @mmuckley pointed out in #18 that he has an efficient solution for this use case.
Thanks a lot!
I'm wondering if we might get better performance with the package by rewriting the backend to call into the PyTorch C++ API directly. This would bypass the Python interpreter, which gives quite a bit of overhead with indexing, and indexing is usually our slowest or second-slowest operation when doing interpolations. The process is documented fairly well here.
The nice thing about using the PyTorch C++ API is we've basically already written the code in C++. torch.Tensor.index_add_
becomes at::Tensor::index_add_
and so on and so forth, so we could basically keep the code we have in Python and transcribe it to C++. As long as we stay wtihin the PyTorch API we shouldn't have to worry about C++ stuff like memory management, as the PyTorch garbage collector will take care of that for us.
One thing to figure out would be distribution. I think the best thing would be a modification to setup.py
that builds for any target system and then uploads it to PyPI, keeping the current pure-Python as a fallback. That is to say, we should avoid compiling anything on the install system. The user should only download wheels and immediately use it, which is what people expect from this package. We could accomplish this by modifying the distribution config to build and upload wheels for each target system. Conservatively we could build for Mac and Linux targets and have Windows as a stretch goal.
I don't have time to do this in the immediate future, but I could assist and review diffs from anyone that is interested. Feel free to post here or submit PRs if you are.
Hi @mmuckley,
thank you for the great job for non-Cartesian MR reconstruction!
I'm currently working on cardiac MRI with radial acquisition. I found that the computation of density compensation takes much longer time than the nufft
itself. Would it be possible to speed it up? Or any suggestions on that?
Thanks in advance!
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.