The following investigates / follows up on a single GPU hitting high memory fragmentation, which causes OOM errors despite the device clearly having sufficient free space.
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 186.00 MiB (GPU 0; 22.13 GiB total capacity; 16.56 GiB already allocated; 164.62 MiB free; 16.78 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
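Following the error message's own suggestion, one mitigation worth trying is capping the caching allocator's split size via the PYTORCH_CUDA_ALLOC_CONF environment variable. A minimal sketch (the 128 MiB value is an assumed starting point for experimentation, not a value taken from the log):

```python
import os

# Must be set before the first CUDA allocation, ideally before importing torch.
# max_split_size_mb caps the size of blocks the caching allocator will split;
# smaller caps can reduce fragmentation at the cost of allocation flexibility.
# 128 is an arbitrary experimental value (assumption, not from the log above).
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch  # imported after setting the env var so the allocator picks it up
```

The same setting can be applied without code changes by exporting PYTORCH_CUDA_ALLOC_CONF in the shell before launching the training script.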
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
File "/home/picocreator/rwkv-proj/picocreator-memory-experiment/RWKV-v4wavenet/src/model.py", line 323, in forward
xr = x * self.time_mix_r + xx * (1 - self.time_mix_r)
k = self.key(xk)
k = torch.square(torch.relu(k))
~~~~~~~~~~ <--- HERE
kv = self.value(k)
return (torch.sigmoid(self.receptance(xr)) * kv,
RuntimeError: Allocation on device 0 would exceed allowed memory. (out of memory)
Currently allocated : 16.66 GiB
Requested : 31.27 MiB
Device limit : 22.13 GiB
Free (according to CUDA): 4.62 MiB
PyTorch limit (set by user-supplied memory fraction): 17179869184.00 GiB
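To confirm that this is fragmentation (reserved far above allocated, with free space scattered across blocks too small to satisfy even a ~186 MiB request) rather than true exhaustion, the allocator's own counters can be dumped around the failing step. A minimal sketch using standard torch.cuda introspection calls:

```python
import torch

def report_cuda_memory(device: int = 0) -> None:
    """Print allocator stats that distinguish fragmentation from true exhaustion."""
    allocated = torch.cuda.memory_allocated(device) / 2**30  # tensors currently in use
    reserved = torch.cuda.memory_reserved(device) / 2**30    # pool held by the caching allocator
    # A large reserved-minus-allocated gap that still cannot serve a small
    # request is the signature of fragmentation.
    print(f"allocated: {allocated:.2f} GiB, reserved: {reserved:.2f} GiB, "
          f"gap: {reserved - allocated:.2f} GiB")
    # Full per-block breakdown, including inactive split blocks:
    print(torch.cuda.memory_summary(device))

report_cuda_memory(0)
```

Calling this just before the `self.key(xk)` / `torch.square(torch.relu(k))` step that fails above should show how much of the 16.78 GiB reserved pool is actually sitting idle in unusable fragments.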